28 Jul, 2025
I was recently working on better support for .properties files in Jar.Tools. Probably every Java developer has worked with this format and thinks of it as a simple one. So did I, until I started implementing it. This is a list of the quirks and interesting cases I ran into along the way.
There are three separators (and one of them is whitespace)
Most people think .properties means key=value. In reality:
- key=value
- key:value
- key␠value (one or more spaces or tabs)
All three are valid. That means the following are different lines with the same meaning:
What I validate
Missing separator: if a non‑comment, non‑blank line has no =, :, or whitespace separator, that’s an error.
Empty key: a line that’s just = or : (or just whitespace before value) is an error for an empty key.
=value # ⟵ error: empty key
:value # ⟵ error: empty key
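The two checks above can be sketched roughly as follows. SeparatorCheck and its Result enum are hypothetical names, and the sketch assumes that leading whitespace is plain indentation (as the JDK treats it) and that continuation lines have already been joined.

```java
final class SeparatorCheck {
    enum Result { OK, SKIPPED, MISSING_SEPARATOR, EMPTY_KEY }

    /** Classifies one logical line (continuations already joined). */
    static Result check(String line) {
        String s = line.stripLeading(); // assumption: leading whitespace is indentation
        if (s.isEmpty() || s.charAt(0) == '#' || s.charAt(0) == '!') {
            return Result.SKIPPED; // blank line or comment
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character, so an escaped '=' or ':' stays part of the key
                continue;
            }
            if (c == '=' || c == ':' || c == ' ' || c == '\t' || c == '\f') {
                // A separator in the first position means there is no key in front of it.
                return i == 0 ? Result.EMPTY_KEY : Result.OK;
            }
        }
        return Result.MISSING_SEPARATOR;
    }

    public static void main(String[] args) {
        System.out.println(check("key=value")); // OK
        System.out.println(check("=value"));    // EMPTY_KEY
        System.out.println(check("justakey"));  // MISSING_SEPARATOR
    }
}
```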
What is allowed
Explicit empty values are fine with any separator:
empty.key=
empty.key:
empty.key␠
All three parse as empty.key with an empty string value.
Continuations: odd vs even backslashes, and trailing whitespace
A line ending with a continuation backslash \ joins with the next line. This is where bugs hide:
- Odd number of trailing backslashes → continuation.
- Even number → the last backslash is escaped, so no continuation.
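The odd/even rule comes down to counting the backslashes at the end of the line. A minimal sketch (the ContinuationCheck name is hypothetical):

```java
final class ContinuationCheck {
    /** Returns true when the line ends with an odd number of backslashes. */
    static boolean continues(String line) {
        int count = 0;
        for (int i = line.length() - 1; i >= 0 && line.charAt(i) == '\\'; i--) {
            count++;
        }
        return count % 2 == 1;
    }

    public static void main(String[] args) {
        System.out.println(continues("key=value\\"));      // one trailing backslash: true
        System.out.println(continues("path=C:\\dir\\\\")); // two trailing backslashes: false
    }
}
```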
Trailing whitespace matters
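My validator still treats a backslash followed by trailing spaces as a continuation marker. (java.util.Properties is stricter here: it continues only when the backslash is the very last character on the line, and reads a backslash followed by a space as an escaped space.) If the file ends right after that whitespace (no next line), it’s a broken continuation error.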
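A rough sketch of joining physical lines into logical ones under these rules. LogicalLines is a hypothetical name, and the sketch does not special-case comments or blank lines; it only shows the trailing-whitespace tolerance and the error at end of file.

```java
import java.util.ArrayList;
import java.util.List;

final class LogicalLines {
    /** Joins physical lines into logical lines; throws if the input ends mid-continuation. */
    static List<String> join(List<String> physical) {
        List<String> logical = new ArrayList<>();
        StringBuilder current = null;
        for (String raw : physical) {
            // After a continuation, leading whitespace on the next line is dropped.
            String line = current == null ? raw : raw.stripLeading();
            // Assumption from the text: whitespace after the backslash does not cancel it.
            String trimmed = line.stripTrailing();
            boolean continues = trailingBackslashes(trimmed) % 2 == 1;
            String part = continues ? trimmed.substring(0, trimmed.length() - 1) : line;
            if (current == null) {
                current = new StringBuilder();
            }
            current.append(part);
            if (!continues) {
                logical.add(current.toString());
                current = null;
            }
        }
        if (current != null) {
            throw new IllegalArgumentException("Broken continuation: input ends after a backslash");
        }
        return logical;
    }

    private static int trailingBackslashes(String s) {
        int n = 0;
        for (int i = s.length() - 1; i >= 0 && s.charAt(i) == '\\'; i--) {
            n++;
        }
        return n;
    }
}
```

For example, the two physical lines "msg=Hello, \" and "    World" come back as the single logical line msg=Hello, World.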
Multiline values done right
When parsed, this becomes a single value:
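To see the effect with the JDK's own parser, here is a small check using java.util.Properties. The fruits key and its value are made up for illustration, and MultilineDemo is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class MultilineDemo {
    /** Loads .properties text with the JDK parser and returns the value stored under key. */
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected when reading from a String
        }
        return props.getProperty(key);
    }

    /** Three physical lines that form one logical entry. */
    static String sample() {
        return String.join("\n",
                "fruits=apple, \\",
                "       banana, \\",
                "       cherry",
                "");
    }

    public static void main(String[] args) {
        // The backslash, the line break, and the next line's indentation are all dropped.
        System.out.println(load(sample(), "fruits")); // apple, banana, cherry
    }
}
```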
Duplicates are subtle (case‑sensitive keys)
I treat keys as case‑sensitive and flag all occurrences when the same key appears multiple times:
All three lines receive a warning that lists the line number of every occurrence (e.g., “Duplicate key ‘duplicate.key’ found at: line 2, line 5, line 8”). By contrast:
Only the two myKey entries get flagged; MyKey is distinct.
Why warn and not error? Parsers resolve duplicates as “last one wins,” and some real configs depend on that, but a duplicate is almost never intentional. A warning keeps you honest without breaking builds.
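A minimal sketch of that duplicate check. DuplicateKeys is a hypothetical name, and for brevity the sketch assumes simple key=value lines split at the first '='; a real parser would reuse its full key extraction.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateKeys {
    /** Maps each key that occurs more than once to the 1-based line numbers of its occurrences. */
    static Map<String, List<Integer>> find(List<String> lines) {
        Map<String, List<Integer>> seen = new LinkedHashMap<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).strip();
            int eq = line.indexOf('=');
            if (line.isEmpty() || line.startsWith("#") || line.startsWith("!") || eq <= 0) {
                continue; // blank line, comment, or not a simple key=value line
            }
            String key = line.substring(0, eq).strip(); // compared exactly, so case matters
            seen.computeIfAbsent(key, k -> new ArrayList<>()).add(i + 1);
        }
        seen.values().removeIf(lineNumbers -> lineNumbers.size() < 2); // keep real duplicates only
        return seen;
    }

    /** Builds one warning message listing every occurrence. */
    static String message(String key, List<Integer> lineNumbers) {
        StringBuilder sb = new StringBuilder("Duplicate key '" + key + "' found at: ");
        for (int i = 0; i < lineNumbers.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append("line ").append(lineNumbers.get(i));
        }
        return sb.toString();
    }
}
```

Because the map is keyed on the exact string, myKey and MyKey land in different buckets without any extra logic.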
Unicode: \uXXXX escapes, surrogate pairs, and “garbage‑in” behavior
Properties files support \uXXXX escapes. That opens a whole can of Unicode worms: invalid lengths, non‑hex digits, surrogate pairs for emoji, and “unknown” escapes.
Invalid escape sequences
Things like \u123 or \u12G4 show up in the wild. I parse them gracefully—no exceptions—and keep values as close as possible to what the user typed. The validator focuses on not crashing; it doesn’t over‑correct malformed text.
Surrogate pairs for emoji
Escaped emoji like \uD83D\uDE80 (🚀) decode correctly. In UTF‑8 mode I emit a warning (“Unicode escape sequence detected”) because direct Unicode is usually clearer. In ISO‑8859‑1 mode, escapes are often necessary, so I emit no warning.
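The surrogate-pair decoding can be checked against the JDK's java.util.Properties, which turns the two escapes into two UTF-16 code units that together form one code point. SurrogateDemo and the rocket key are hypothetical names for this sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class SurrogateDemo {
    /** Loads .properties text with the JDK parser and returns the value stored under key. */
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected when reading from a String
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The file content is: rocket= followed by two escaped UTF-16 code units.
        String value = load("rocket=\\uD83D\\uDE80\n", "rocket");
        System.out.println(value.length());                          // 2 (UTF-16 code units)
        System.out.println(value.codePointCount(0, value.length())); // 1 (code point)
        System.out.println(value.codePointAt(0) == 0x1F680);         // true (U+1F680, the rocket)
    }
}
```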
Standard escapes “just work”
The usual suspects decode as expected:
- \t, \n, \r, \f, \\
- escaped separators and specials: \ , \:, \=, \#, \!
Unknown single‑letter escapes like \q or \z are treated literally (the backslash disappears, the letter stays). Again: avoid surprising the user.
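A sketch of a lenient decoder along these lines. LenientUnescape is a hypothetical name, and keeping a malformed Unicode escape exactly as typed is one possible reading of "as close as possible to what the user typed". The JDK's java.util.Properties, by contrast, rejects malformed Unicode escapes with an IllegalArgumentException.

```java
final class LenientUnescape {
    /** Decodes escapes in a raw value; never throws on malformed input. */
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '\\' || i + 1 >= raw.length()) {
                out.append(c); // ordinary character, or a lone trailing backslash kept as typed
                continue;
            }
            char next = raw.charAt(++i);
            switch (next) {
                case 't': out.append('\t'); break;
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 'f': out.append('\f'); break;
                case 'u':
                    if (isHex4(raw, i + 1)) {
                        out.append((char) Integer.parseInt(raw.substring(i + 1, i + 5), 16));
                        i += 4;
                    } else {
                        out.append('\\').append('u'); // malformed: keep as typed, do not throw
                    }
                    break;
                default:
                    out.append(next); // escaped backslash, separators, and unknown letters
            }
        }
        return out.toString();
    }

    /** True when four hex digits start at index from. */
    private static boolean isHex4(String s, int from) {
        if (from + 4 > s.length()) {
            return false;
        }
        for (int i = from; i < from + 4; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }
}
```

Surrogate pairs need no special handling here: each escape becomes one UTF-16 code unit, and two adjacent units form the emoji.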
Encoding modes: UTF‑8 vs ISO‑8859‑1
Historically, Java treated .properties as Latin‑1 (ISO‑8859‑1), with \uXXXX for anything beyond that range. Many modern tools use UTF‑8. To make intent explicit, I let the validator run in either mode.
ISO‑8859‑1 mode
Error on characters outside Latin‑1.
unicode.chinese=你好世界 # error (outside ISO-8859-1)
unicode.emoji=🎉🚀 # error
valid.iso=café # fine (é is Latin‑1)
\uXXXX for Latin‑1 letters like \u00e9 (é) is allowed and not warned.
UTF‑8 mode
- Direct Unicode is preferred and not warned.
- \uXXXX escapes are warned as unnecessary (but still decoded). That includes escapes for ASCII: \u0041 → “A” with a warning.
Pick the mode that matches your runtime, and you’ll get the right balance of errors vs. guidance.
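The two modes can be sketched as a per-line check. EncodingModeCheck, its Mode enum, and the message strings are hypothetical; the only JDK pieces relied on are CharsetEncoder.canEncode and a regular expression. The sketch does not account for an escaped backslash directly before the u.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class EncodingModeCheck {
    enum Mode { UTF_8, ISO_8859_1 }

    // Matches a backslash, the letter u, and four hex digits.
    private static final Pattern UNICODE_ESCAPE = Pattern.compile("\\\\u[0-9a-fA-F]{4}");

    /** Returns the findings for one raw line in the given mode. */
    static List<String> check(String rawLine, Mode mode) {
        List<String> findings = new ArrayList<>();
        if (mode == Mode.ISO_8859_1
                && !StandardCharsets.ISO_8859_1.newEncoder().canEncode(rawLine)) {
            findings.add("error: character outside ISO-8859-1");
        }
        if (mode == Mode.UTF_8 && UNICODE_ESCAPE.matcher(rawLine).find()) {
            findings.add("warning: Unicode escape sequence detected");
        }
        return findings;
    }

    public static void main(String[] args) {
        // Prints [warning: Unicode escape sequence detected]
        System.out.println(check("greeting=\\u0041", Mode.UTF_8));
        // Prints [] because escapes are fine in Latin-1 mode
        System.out.println(check("greeting=\\u0041", Mode.ISO_8859_1));
    }
}
```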
Comments and structure: preserve intent, don’t rewrite history
Lines starting with # or ! are comments. During validation, I:
- Attach leading comments to the next property as leadingComments.
- Keep raw text for each entry exactly as read.
- Do not escape or normalize anything during validation.
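A rough sketch of the comment attachment described above. CommentAttachment and Entry are hypothetical names (only leadingComments comes from the text), and two behaviors are assumptions: blank lines do not detach pending comments, and comments with no following property are dropped.

```java
import java.util.ArrayList;
import java.util.List;

final class CommentAttachment {
    /** One property entry: raw text exactly as read, plus the comment lines above it. */
    static final class Entry {
        final String raw;
        final List<String> leadingComments;

        Entry(String raw, List<String> leadingComments) {
            this.raw = raw;
            this.leadingComments = leadingComments;
        }
    }

    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        for (String line : lines) {
            String t = line.stripLeading();
            if (t.isEmpty()) {
                continue; // assumption: a blank line keeps the pending comments
            }
            if (t.startsWith("#") || t.startsWith("!")) {
                pending.add(line); // stored untouched
            } else {
                entries.add(new Entry(line, pending)); // raw text is not normalized
                pending = new ArrayList<>();
            }
        }
        return entries;
    }
}
```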
During formatting, I:
- Preserve comments as‑is.
- Apply consistent key = value spacing.
- Escape =, :, and spaces inside values so the output remains parsable:
# original
key=value with = and : chars
# formatted
key = value with \= and \: chars
This “no touching during validation” rule prevents a whole class of “the linter changed my config” surprises.
Lines that look empty… but aren’t
A sneaky category:
- A line that’s only = or : → empty key error.
- A line that’s key␠␠␠ → a valid key with an explicit empty value (whitespace is the separator).
Whitespace around separators with empty values is fine:
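For reference, the JDK's java.util.Properties reads all of these shapes as an empty string. The keys a to d and the EmptyValueDemo name below are made up for this sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class EmptyValueDemo {
    /** Loads .properties text with the JDK parser. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected when reading from a String
        }
        return props;
    }

    public static void main(String[] args) {
        // Four lines: "a =", "b : ", "c   " (whitespace only), and "d=".
        Properties p = load("a =\nb : \nc   \nd=\n");
        for (String key : new String[] {"a", "b", "c", "d"}) {
            System.out.println(key + " -> [" + p.getProperty(key) + "]"); // value is empty each time
        }
    }
}
```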
A practical checklist (aka mini‑linter rules)
Flag lines with no =, :, or whitespace separator (error).
Flag empty keys (error) but allow explicit empty values.
Handle continuation logic: odd vs even trailing backslashes; treat trailing whitespace after a continuation backslash as continuation; error if EOF cuts it off.
Treat keys as case‑sensitive; warn on duplicates and list all occurrences.
Decode standard escapes; treat unknown escapes literally without crashing.
Support UTF‑8 and ISO‑8859‑1 modes:
- UTF‑8: warn on \uXXXX as unnecessary.
- ISO‑8859‑1: error on out‑of‑range chars; allow \uXXXX freely.
Keep validation read‑only; do formatting in a separate step.
Preserve comments and attach them to following entries for context.
Represent multiline values as a single logical value; track start/end lines for tooling.
Closing thoughts
I was planning to be done with .properties file validation in a few days, tops, but after a week of debugging I realized that even though the format looks simple, real‑world files mix legacy encoding rules, permissive separators, escape sequences, and multiline values. I will not touch this format again :)


