Unicode Normalization Explained: NFC, NFD, NFKC, NFKD
The same text can have multiple Unicode representations. Normalization converts text to a standard form for reliable comparison.
The Four Forms
| Form | Name | Use Case |
|---|---|---|
| NFC | Canonical Composition | Most common; recommended for web content and storage |
| NFD | Canonical Decomposition | Used (in a variant form) by the macOS HFS+ file system |
| NFKC | Compatibility Composition | Search and matching |
| NFKD | Compatibility Decomposition | First step when stripping accents |
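To make the table concrete, here is a minimal sketch using Python's standard `unicodedata` module. The sample string is an illustrative choice, not something from a particular data set: it combines a compatibility character (the "ﬁ" ligature, U+FB01) with a precomposed "é" (U+00E9), so each form produces a different code point sequence.

```python
import unicodedata

# Illustrative sample: ligature "ﬁ" (U+FB01) + "anc" + precomposed "é" (U+00E9)
sample = "\ufb01anc\u00e9"

# Apply each of the four normalization forms to the same input
forms = {
    form: unicodedata.normalize(form, sample)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

for form, text in forms.items():
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(f"{form:<4} len={len(text)}  {code_points}")
```

The canonical forms (NFC, NFD) leave the ligature alone and only change how "é" is encoded. The compatibility forms (NFKC, NFKD) also replace the ligature with plain "f" + "i", which is why they suit search and matching but are a poor choice for storing text you want to round-trip.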
Why It Matters
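"café" can be encoded as 4 code points (NFC: é as the single code point U+00E9) or 5 code points (NFD: e followed by the combining acute accent U+0301). Both render identically, but a code-point-by-code-point comparison treats them as different strings unless both are first normalized to the same form.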
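The "café" case can be reproduced in a few lines of Python. This is a sketch using the standard `unicodedata` module; the helper name `nfc_equal` is a hypothetical one chosen for illustration.

```python
import unicodedata

composed = "caf\u00e9"     # é as a single code point (U+00E9)
decomposed = "cafe\u0301"  # e followed by combining acute accent (U+0301)


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(len(composed), len(decomposed))    # 4 5
print(composed == decomposed)            # False
print(nfc_equal(composed, decomposed))   # True
```

Any of the four forms works for the comparison as long as both sides use the same one; NFC is used here because it is the form most text already arrives in.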
Remove Accents
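Use our plain text converter with the "Remove accents/diacritics" option to strip accents from text. Note that this is lossy and goes beyond normalization: normalization alone keeps the accents and only changes how they are encoded.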
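For readers who want to do the same thing in code, a common approach is to decompose the text and then drop the combining marks. The sketch below shows that general technique with Python's standard `unicodedata` module; it is not necessarily how the converter itself is implemented, and `strip_accents` is a hypothetical helper name.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove diacritics by decomposing and dropping combining marks (lossy)."""
    # NFKD splits accented letters into a base letter plus combining marks
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns 0 for non-combining characters
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left into the common NFC form
    return unicodedata.normalize("NFC", stripped)


print(strip_accents("café naïve Ångström"))  # cafe naive Angstrom
```

Because NFKD is a compatibility form, this also rewrites characters such as ligatures; use NFD in the first step if only canonical decomposition is wanted. Letters that have no decomposition, such as "ø", pass through unchanged and need a separate mapping if they should become ASCII.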