Unicode compatibility characters

In Unicode and the UCS, a compatibility character is a character that is encoded solely to maintain round-trip convertibility with other, often older, standards.[1] As the Unicode Glossary says:

A character that would not have been encoded except for compatibility and round-trip convertibility with other standards[2]

Although the word compatibility appears in some character names, it is not itself a character property, and in practice the definition is more complicated than the glossary suggests. One of the properties the Unicode Consortium assigns to characters is a decomposition, which may be a compatibility decomposition. More than five thousand characters have a compatibility decomposition that maps the compatibility character to one or more other UCS characters. By giving a character a compatibility decomposition, Unicode establishes it as a compatibility character. The reasons for these designations vary and are discussed in further detail below. The term decomposition can be confusing because a character's decomposition is, in some cases, a singleton: the decomposition of one character is simply another, approximately (but not canonically) equivalent, character.
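The decomposition property can be inspected directly. The following is a minimal sketch, assuming Python's standard unicodedata module, whose data reflects whichever Unicode version ships with the interpreter. The helper name compatibility_mapping is hypothetical and chosen only for illustration. The module reports a compatibility decomposition with a leading tag in angle brackets (such as <compat> or <super>), while a canonical decomposition has no tag.

```python
import unicodedata


def compatibility_mapping(ch):
    """Return (tag, target) if ch has a compatibility decomposition, else None.

    Hypothetical helper: it relies on unicodedata.decomposition(), which
    returns a string such as '<compat> 0066 0069' for a compatibility
    mapping, '0065 0301' for a canonical one, and '' when there is none.
    """
    raw = unicodedata.decomposition(ch)
    if not raw.startswith("<"):
        # No decomposition at all, or a canonical (untagged) one.
        return None
    tag, *codepoints = raw.split()
    target = "".join(chr(int(cp, 16)) for cp in codepoints)
    return tag.strip("<>"), target


# U+FB01 LATIN SMALL LIGATURE FI maps to the two letters "f" and "i".
print(compatibility_mapping("\ufb01"))  # ('compat', 'fi')

# U+00B5 MICRO SIGN is a singleton: it maps to one character, U+03BC.
print(compatibility_mapping("\u00b5") == ("compat", "\u03bc"))  # True

# U+00E9 has only a canonical decomposition, so it is not reported here.
print(compatibility_mapping("\u00e9"))  # None
```

The same mappings are what the compatibility normalization forms apply: unicodedata.normalize("NFKC", "\ufb01") yields "fi", whereas the canonical form NFC leaves the ligature unchanged.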

  1. ^ "Chapter 2.3: Compatibility characters" (PDF). The Unicode Standard 6.0.0.
  2. ^ Unicode Consortium. Unicode Glossary.