GBK (character encoding)

Guójiā Biāozhǔn Kuòzhǎn (GBK)
Figure: Layout of GBK (see below for a larger copy of this diagram)
MIME / IANA: GBK
Alias(es): CP936, MS936, windows-936, csGBK
Language(s): Primarily Simplified Chinese; also supports Traditional Chinese, Japanese, English, Russian and (partially) Greek. Web browsers decode GBK as GB 18030, which supports all languages.
Standard: GBK 1.0
Classification: Extended ASCII,[a] variable-width encoding, CJK encoding
Extends: EUC-CN
Preceded by: GB 2312
Succeeded by: GB 18030
  a. ^ Not in the strictest sense of the term, as ASCII bytes can appear as trail bytes.

GBK is an extension of the GB 2312 character set for Simplified Chinese characters, used in the People's Republic of China. It includes all unified CJK characters found in GB 13000.1-93, i.e. ISO/IEC 10646:1993, or Unicode 1.1. Since its initial release in 1993, GBK has been extended by Microsoft in Code page 936/1386, which was then extended into GBK 1.0. GBK is also the IANA-registered internet name for the Microsoft mapping,[1] which differs from other implementations primarily by the single-byte euro sign at 0x80.

GB abbreviates Guójiā Biāozhǔn, which means "national standard" in Chinese, while K stands for "extension" (扩展 kuòzhǎn). GBK extended the older GB 2312 standard not only with Traditional Chinese characters but also with Chinese characters that were simplified after GB 2312 was established in 1981. With the arrival of GBK, certain names containing previously unrepresentable characters, such as the 镕 (róng) in former Chinese Premier Zhu Rongji's name, became representable.[2]
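The difference in repertoire can be checked with a short Python sketch. It assumes that Python's built-in `gb2312` and `gbk` codecs are reasonable stand-ins for the two standards; the helper name `can_encode` is hypothetical and exists only for this illustration.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of `text` is representable in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


rong = "\u9555"  # 镕, the character in Zhu Rongji's name cited above

# Assumption: Python's codecs follow the respective character sets closely.
in_gb2312 = can_encode(rong, "gb2312")
in_gbk = can_encode(rong, "gbk")
gbk_bytes = rong.encode("gbk") if in_gbk else b""

print(f"GB 2312: {in_gb2312}, GBK: {in_gbk}, GBK bytes: {gbk_bytes.hex()}")

# A character that is already in GB 2312 keeps the same code in GBK.
zhu = "\u6731"  # 朱
same_bytes = zhu.encode("gb2312") == zhu.encode("gbk")
print(f"Same bytes in both encodings: {same_bytes}")
```

Under these assumptions, 镕 should be rejected by the `gb2312` codec and accepted by `gbk` as a two-byte sequence, while 朱 is encoded identically in both.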

As of October 2022, GBK is the third-most popular encoding served from China and territories (after UTF-8 and its own subset, GB 2312), with 1.9% of web servers serving a page that declares GBK.[3] However, all major web browsers decode GB2312-marked documents as if they were marked GBK, except for Safari and Edge on the label GB_2312.[4] Together, the GBK and GB 2312 encodings have a combined 5.5% presence in China and territories.[3] Globally, GBK accounts for less than 0.07% of all web pages, and GBK and GB 2312 combined for 0.2%.[5]
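The browser behaviour described above, where a GB2312 label selects the GBK decoder, can be approximated as follows. This is a simplified sketch rather than any browser's actual implementation: the label set `GBK_LABELS` and the function `decode_declared` are hypothetical names, and the set only contains the aliases listed in this article plus `gb2312`.

```python
# Hypothetical, simplified label table: declared labels that this sketch
# routes to the GBK decoder. It is not a complete list of real-world labels.
GBK_LABELS = {"gbk", "gb2312", "cp936", "ms936", "windows-936", "csgbk"}


def decode_declared(data: bytes, label: str) -> str:
    """Decode `data` using the declared label, treating GB2312 as GBK."""
    normalized = label.strip().lower()
    codec = "gbk" if normalized in GBK_LABELS else normalized
    return data.decode(codec)


# A page that declares GB2312 but contains a GBK-only character (镕).
page = "朱镕基".encode("gbk")
text = decode_declared(page, "GB2312")
print(text)

# For comparison: decoding the same bytes strictly as GB 2312.
try:
    page.decode("gb2312")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(f"Strict GB 2312 decoding succeeded: {strict_ok}")
```

Routing the label to the wider decoder loses nothing for documents that really are GB 2312, since GBK extends that character set, and it rescues mislabelled pages that use GBK-only characters.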

  1. ^ "Character Sets". Retrieved 2016-10-03.
  2. ^ "Code Page 936 - PRC GBK (XGB)". Microsoft. Archived from the original on 2002-10-01. Conversion map between code page 936 and Unicode. GB 18030 or GBK must be selected manually in the browser to view it correctly.
  3. ^ a b "Distribution of Character Encodings among websites that use China and territories". w3techs.com. Retrieved 2022-10-25.
  4. ^ "Encoding: Summarized test results". www.w3.org. Retrieved 2019-11-15.
  5. ^ "Historical trends in the usage statistics of character encodings for websites, October 2022". w3techs.com. Retrieved 2022-10-25.