Specials (Unicode block)

Specials
Range: U+FFF0..U+FFFF (16 code points)
Plane: BMP
Scripts: Common
Assigned: 5 code points
Unused: 9 reserved code points, 2 noncharacters
Unicode version history:
  1.0.0 (1991): 1 (+1)
  2.1 (1998): 2 (+1)
  3.0 (1999): 5 (+3)
Unicode documentation: Code chart ∣ Web page
Note: [1][2]

Specials is a short Unicode block allocated at the very end of the Basic Multilingual Plane, at U+FFF0–U+FFFF. The code points U+FFF0–U+FFF8 are unassigned; the remaining code points are:

  • U+FFF9 INTERLINEAR ANNOTATION ANCHOR, marks the start of annotated text
  • U+FFFA INTERLINEAR ANNOTATION SEPARATOR, marks the start of the annotating character(s)
  • U+FFFB INTERLINEAR ANNOTATION TERMINATOR, marks the end of an annotation block
  • U+FFFC OBJECT REPLACEMENT CHARACTER, a placeholder in the text for another unspecified object, for example in a compound document
  • U+FFFD REPLACEMENT CHARACTER, used to replace an unknown, unrecognised, or unrepresentable character
  • U+FFFE <noncharacter-FFFE>, not a character
  • U+FFFF <noncharacter-FFFF>, not a character
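
The list above can be explored with a short Python sketch. The helper name `describe_special` is hypothetical and chosen only for illustration; it relies on the standard `unicodedata` module for character names and treats any unnamed code point in the block as reserved. The last lines show one common way U+FFFD appears in practice, assuming Python's built-in `errors="replace"` decoding handler.

```python
import unicodedata

# The Specials block spans U+FFF0..U+FFFF (16 code points).
SPECIALS = range(0xFFF0, 0x10000)

def describe_special(cp: int) -> str:
    """Return a short label for a code point in the Specials block.

    Hypothetical helper for illustration, not part of any library.
    """
    if cp not in SPECIALS:
        raise ValueError(f"U+{cp:04X} is not in the Specials block")
    if cp in (0xFFFE, 0xFFFF):
        return "noncharacter"
    # unicodedata.name() returns the default for unnamed code points.
    name = unicodedata.name(chr(cp), "")
    return name or "reserved"

# Print a one-line summary for every code point in the block.
for cp in SPECIALS:
    print(f"U+{cp:04X} {describe_special(cp)}")

# A truncated UTF-8 sequence decodes to U+FFFD under errors="replace".
decoded = b"caf\xe9".decode("utf-8", errors="replace")
print(repr(decoded))
```

In this sketch the byte 0xE9 begins a multi-byte UTF-8 sequence that is never completed, so the decoder substitutes a single U+FFFD for it.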

U+FFFE <noncharacter-FFFE> and U+FFFF <noncharacter-FFFF> are noncharacters: they are reserved, but their presence does not make Unicode text ill-formed. Versions of the Unicode Standard from 3.1.0 to 6.3.0 stated that noncharacters should never be interchanged, which led some applications to guess text encoding by treating the presence of either code point as a sign that the text is not Unicode. Corrigendum #9 later clarified that noncharacters are not illegal, so this method of checking text encoding is incorrect.[3] One internal use of U+FFFE is in the CLDR collation algorithm, an extension of the Unicode Collation Algorithm that maps the noncharacter to a minimal, unique primary weight.[4]
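
As a minimal sketch of the point about well-formedness, the Python below checks whether a code point is a noncharacter and confirms that U+FFFE and U+FFFF survive a UTF-8 round trip. The function names are hypothetical. The check also covers the noncharacters outside this block (U+FDD0–U+FDEF and the last two code points of every plane), which this article does not discuss but which the Unicode Standard defines alongside these two.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of Unicode's 66 noncharacters.

    Hypothetical helper: covers U+FDD0..U+FDEF plus the last two
    code points of each of the 17 planes (U+nFFFE and U+nFFFF).
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def round_trips_utf8(cp: int) -> bool:
    """Encode the code point as UTF-8 and decode it again."""
    ch = chr(cp)
    return ch.encode("utf-8").decode("utf-8") == ch

# Noncharacters encode and decode without error, so their presence
# alone is not evidence that a byte stream is "not Unicode".
for cp in (0xFFFE, 0xFFFF):
    print(f"U+{cp:04X}", is_noncharacter(cp), round_trips_utf8(cp))
```

A detector that rejects text merely because `is_noncharacter` returns true for some code point would repeat the mistake that Corrigendum #9 addresses.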

The character U+FEFF ZERO WIDTH NO-BREAK SPACE can be inserted at the beginning of a Unicode text as a byte order mark to signal its endianness. Because U+FFFE is a noncharacter, a program that reads the first 16-bit code unit of such a text as 0xFFFE can conclude that it is seeing a byte-swapped U+FEFF, and that it should swap the byte order of all the following code units.
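
The byte order check described above can be sketched in Python as follows. The function names `utf16_byte_order` and `decode_utf16` are hypothetical; the sketch assumes the input is UTF-16 and does not try to distinguish a little-endian UTF-32 mark, which begins with the same two bytes.

```python
from typing import Optional

def utf16_byte_order(data: bytes) -> Optional[str]:
    """Return 'big' or 'little' if data starts with a UTF-16 BOM, else None.

    Hypothetical helper; assumes the data is UTF-16 if a mark is found.
    """
    if len(data) < 2:
        return None
    # Read the first 16-bit code unit as if the text were big-endian.
    first = int.from_bytes(data[:2], "big")
    if first == 0xFEFF:
        return "big"      # the mark reads correctly: no swap needed
    if first == 0xFFFE:
        return "little"   # noncharacter seen: bytes are swapped
    return None

def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 text using its byte order mark, dropping the mark."""
    order = utf16_byte_order(data)
    if order is None:
        raise ValueError("no byte order mark found")
    codec = "utf-16-be" if order == "big" else "utf-16-le"
    return data[2:].decode(codec)

sample = "\ufeffhi".encode("utf-16-le")   # b"\xff\xfeh\x00i\x00"
print(utf16_byte_order(sample), decode_utf16(sample))
```

Python's own `utf-16` codec performs the same detection internally; the explicit version is shown only to make the 0xFFFE comparison visible.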

The block's name in Unicode 1.0 was Special.[5]

  1. ^ "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. ^ "Corrigendum #9: Clarification About Noncharacters". The Unicode Standard. Archived from the original on 2023-06-10. Retrieved 2023-06-07.
  4. ^ "Unicode Technical Standard #35". Unicode Locale Data Markup Language (LDML). Retrieved 2024-08-27.
  5. ^ "3.8: Block-by-Block Charts" (PDF). The Unicode Standard. Version 1.0. Unicode Consortium. Archived (PDF) from the original on 2021-02-11. Retrieved 2020-09-30.