GB 2312

MIME / IANA: GB_2312-80 (GB2312 for usual EUC form)
Alias(es): iso-ir-58, chinese, csGB2312, csISO58GB231280
Language(s): Simplified Chinese, English
Partial support: Traditional Chinese, Russian, Bulgarian, Greek, Japanese, Italian, Irish, Māori
Standard: GB/T 2312-1980
Classification: ISO-2022-compatible DBCS, CJK encoding
Extensions: ISO-IR-165
Encoding formats
Preceded by: Chinese telegraph code
Succeeded by: GBK, GB 18030
Other related encoding(s): JIS X 0208, KS X 1001

GB/T 2312-1980 is a key official character set of the People's Republic of China, used for Simplified Chinese characters. GB2312 is the registered internet name for EUC-CN, which is its usual encoded form. GB refers to the Guobiao standards (国家标准), whereas the T suffix (推荐; tuījiàn; 'recommendation') denotes a non-mandatory standard.[1]

GB/T 2312-1980 was originally a mandatory national standard designated GB 2312-1980. Following a National Standard Bulletin of the People's Republic of China in 2017, however, it is no longer mandatory, and its standard code was changed to GB/T 2312-1980.[2] It has been superseded by GBK and GB 18030, which include additional characters, but it remains in widespread use as a subset of those encodings.

As of September 2022, GB2312 is the second-most popular encoding served from China and territories (after UTF-8), with 5.5% of web servers serving a page declaring it.[3] Globally, GB2312 is declared on 0.1% of all web pages.[4] However, all major web browsers decode GB2312-marked documents as if they were marked with the superset GBK encoding; the exception is the label GB_2312, which Safari and Edge do not treat this way.[5]
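The superset relationship can be illustrated with a minimal Python sketch. It assumes the standard library codecs named `gb2312` and `gbk` as stand-ins for the two encodings; the sample strings and the helper `encodable` are illustrative choices, not part of any standard.

```python
# Sketch: bytes produced by the strict GB2312 (EUC-CN) codec decode to the
# same text under GBK, while some GBK characters have no GB2312 encoding.

text = "汉字"  # simplified characters present in GB/T 2312

raw = text.encode("gb2312")        # two bytes per character in EUC-CN form
as_gb2312 = raw.decode("gb2312")
as_gbk = raw.decode("gbk")         # decoding the same bytes as GBK


def encodable(ch: str, codec: str) -> bool:
    """Return True if `ch` can be encoded with the named codec (hypothetical helper)."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(raw.hex(), as_gb2312 == as_gbk)
# The traditional form 漢 is covered by GBK but not by GB/T 2312.
print(encodable("漢", "gbk"), encodable("漢", "gb2312"))
```

Because every valid GB2312 byte sequence decodes to the same text under GBK, treating a GB2312 label as GBK loses nothing for conforming documents and also tolerates pages that contain GBK-only characters.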

An analogous character set, GB/T 12345 (Code of Chinese ideogram set for information interchange, supplementary set), adapts GB/T 2312 for traditional characters: it replaces simplified forms with traditional forms at the same qūwèi code and adds 62 supplemental characters.[6][7] GB-encoded fonts often come in pairs, one with the GB/T 2312 (simplified) character set and the other with the GB/T 12345 (traditional) character set. Further GB supplementary sets also exist, including GB/T 7589 (Code of Chinese ideograms set for information interchange, the 2nd supplementary set) and GB/T 7590 (Code of Chinese ideograms set for information interchange, the 4th supplementary set). These provide additional variant characters in the same qūwèi encoding format (later used in ISO-2022-CN), but their characters are unrelated to those encoded in GB/T 2312.
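As a rough illustration of the qūwèi format mentioned above, the sketch below converts a qūwèi position (row and cell, each 1 to 94) to its EUC-CN byte pair and back. It relies on the EUC convention of adding 0xA0 to each number; the function names are hypothetical, and Python's `gb2312` codec is assumed as the reference decoder.

```python
from typing import Tuple


def quwei_to_euc_cn(row: int, cell: int) -> bytes:
    """Map a qūwèi (row, cell) position to its two-byte EUC-CN form."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    # EUC-CN stores each coordinate offset by 0xA0 (high bit set).
    return bytes([0xA0 + row, 0xA0 + cell])


def euc_cn_to_quwei(pair: bytes) -> Tuple[int, int]:
    """Recover the qūwèi (row, cell) position from a two-byte EUC-CN sequence."""
    if len(pair) != 2 or not all(0xA1 <= b <= 0xFE for b in pair):
        raise ValueError("expected two bytes in the range 0xA1..0xFE")
    return pair[0] - 0xA0, pair[1] - 0xA0


# Row 16, cell 1 is the first hanzi position in GB/T 2312.
encoded = quwei_to_euc_cn(16, 1)
print(encoded.hex(), encoded.decode("gb2312"))
print(euc_cn_to_quwei("啊".encode("gb2312")))
```

Since GB/T 12345 reuses the same qūwèi positions, the same arithmetic would apply to it; only the character assigned to a given position differs.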

  1. Lunde, Ken (2009). CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing (2nd ed.). Sebastopol, CA: O'Reilly. pp. 94–111. ISBN 978-0-596-51447-1.
  2. "2017年第7号中国国家标准公告 (China National Standard Bulletin 2017 No.7)". Standardization Administration of the People's Republic of China. Retrieved 2018-07-03.
  3. "Distribution of Character Encodings among websites that use China and territories". w3techs.com. Retrieved 2022-09-04.
  4. "Historical trends in the usage statistics of character encodings for websites, October 2022". w3techs.com. Retrieved 2022-10-01.
  5. "Encoding: Summarized test results". www.w3.org. Retrieved 2019-11-15.
  6. Lunde, Ken (1998). "Appendix F: GB/T 12345". CJKV Information Processing (PDF). O'Reilly Media. ISBN 9781565922242.
  7. GB12345-80 to Unicode table. Unicode Consortium. 1993-12-06. Archived from the original on 2004-06-17.