{{Short description|Encoding for Traditional Chinese characters}} {{Other uses|Big Five (disambiguation)}} {{Multiple issues| {{More citations needed|date=January 2021}} {{tone|date=June 2013}} {{Lead too short|date=September 2023}} }} {{Infobox character encoding | name = Big5 | mime = Big5 | image = | caption = | alias = Big-5, 大五碼 | by = [[Institute for Information Industry]] | standard = | lang = [[traditional Chinese characters|Traditional Chinese]], [[English language|English]]<br/>'''Partial support:'''<br/>[[Simplified Chinese]], [[Greek language|Greek]], [[Japanese language|Japanese]], [[Russian language|Russian]], [[Bulgarian language|Bulgarian]], some of [[International Phonetic Alphabet|IPA]] letters for phonetic usage.<ref>{{cite web|url=http://ash.jp/code/cn/big5tbl.htm|title=Big5 (Traditional Chinese) character code table|access-date=2007-08-23|archive-date=2002-05-04|archive-url=https://web.archive.org/web/20020504075931/http://ash.jp/code/cn/big5tbl.htm|url-status=dead}}</ref> | status = | extends = [[ASCII]]{{efn|name=ASCII}} | extensions = [[Windows-950]], [[Hong Kong Supplementary Character Set|Big5-HKSCS]], [[#Extensions|numerous others]] | prev = | next = | encodes = | classification = [[Extended ASCII]],{{efn|Not in the strictest sense of the term, as ASCII bytes can appear as trail bytes.}}{{efn|Big5 does not specify a single-byte component; however, ASCII (or an extension) is used in practice.|name=ASCII}} [[variable-width encoding]], [[double-byte character set|DBCS]], [[CJK characters|CJK encoding]] | otherrelated = [[CNS 11643]] | extra = <div style="text-align: left;">{{notelist}}</div> }} '''Big-5''' or '''Big5''' ({{lang-zh|t=大五碼}}) is a [[Chinese character encoding]] method used in [[Taiwan]], [[Hong Kong]], and [[Macau]] for [[traditional Chinese characters]]. 
The [[People's Republic of China]] (PRC), which uses [[simplified Chinese characters]], uses the [[GB 18030]] character set instead (though it can also substitute Big-5 or UTF-8).{{citation needed|date=June 2025}} Big5 gets its name from the consortium of five companies in Taiwan that developed it.<ref>{{Cite web|title=Character Sets|url=http://chinesemac.org/pages/character_sets.html|access-date=2021-08-31|website=chinesemac.org|archive-date=2017-08-12|archive-url=https://web.archive.org/web/20170812225334/http://chinesemac.org/pages/character_sets.html|url-status=live}}</ref>

==Encoding==
The original Big5 character set is sorted first by usage frequency, then by stroke count, and lastly by [[List of Kangxi radicals|Kangxi radical]].

The original Big5 character set lacked many commonly used characters. To solve this problem, each vendor developed its own extension. The [[ETen Chinese System|ETen]] extension became part of the current Big5 standard through popularity.

The structure of Big5 does not conform to the [[ISO 2022]] standard, but rather bears a certain similarity to the {{nowrap|[[Shift JIS]]}} encoding. It is a [[double-byte character set|double-byte character set (DBCS)]] with the following structure (the prefix 0x signifies a hexadecimal number):

{| border=1 style="border-collapse: collapse" class="wikitable plainrowheaders"
|-
! scope="row"| First byte ("lead byte")
| {{mono|0x81}} to {{mono|0xfe}} (or {{mono|0xa1}} to {{mono|0xf9}} for non-user-defined characters)
|-
! scope="row"| Second byte
| {{mono|0x40}} to {{mono|0x7e}}, {{mono|0xa1}} to {{mono|0xfe}}
|}

Standard assignments (excluding vendor or user-defined extensions) do not use the bytes {{mono|0x7F}} through {{mono|0xA0}}, nor {{mono|0xFF}}, as either lead (first) or trail (second) bytes. Bytes {{mono|0xA1}} through {{mono|0xFE}} are used for both lead and trail bytes for double-byte (Big5) codes.
Bytes {{mono|0x40}} through {{mono|0x7E}} are used as trail bytes following a lead byte, or for single-byte codes otherwise. If the second byte is not in either range, [[unspecified behavior|behavior is unspecified]] (i.e., varies from system to system). Additionally, certain variants of the Big5 character set, for example the [[HKSCS]], use an expanded range for the lead byte, including values in the {{mono|0x81}} to {{mono|0xA0}} range (similar to {{nowrap|Shift JIS}}), whereas others use reduced lead byte ranges (for instance, the Apple Macintosh variant uses {{mono|0xFD}} through {{mono|0xFF}} as single-byte codes, limiting the lead byte range to {{mono|0xA1}} through {{mono|0xFC}}).<ref name="mactradchinese">{{citation|mode=cs1|url=https://unicode.org/Public/MAPPINGS/VENDORS/APPLE/CHINTRAD.TXT|title=Map (external version) from Mac OS Chinese Traditional encoding to Unicode 3.0 and later.|author=Apple, Inc|author-link=Apple, Inc|publisher=[[Unicode Consortium]]|date=2005-04-04|orig-year=1996-06-31|access-date=2021-02-24|archive-date=2021-05-14|archive-url=https://web.archive.org/web/20210514182521/https://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CHINTRAD.TXT|url-status=live}}</ref>

The numerical value of an individual Big5 code is frequently given as a 4-digit hexadecimal number, which describes the two bytes that make up the code as if they were a [[big endian]] representation of a 16-bit number. For example, the Big5 code for a full-width space, which consists of the bytes {{mono|0xa1}} {{mono|0x40}}, is usually written as {{mono|0xa140}} or just A140.

Strictly speaking, the Big5 encoding contains only DBCS characters. However, in practice, the Big5 codes are always used together with an unspecified, system-dependent [[SBCS|single-byte character set (SBCS)]] (such as [[ASCII]] or [[code page 437]]), so that Big5-encoded text contains a mix of double-byte characters and single-byte characters.
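The byte ranges and the four-digit notation described above can be sketched in a few lines of Python. This is only an illustration under the ranges stated in this section for standard (non-user-defined) Big5; the names <code>is_lead</code>, <code>is_trail</code>, <code>pack</code> and <code>unpack</code> are hypothetical and do not belong to any library.

```python
# Sketch of the standard Big5 byte ranges described above.
# All names are illustrative; variants such as HKSCS widen the lead range.

def is_lead(b: int) -> bool:
    """Lead byte of a non-user-defined Big5 code: 0xA1-0xF9."""
    return 0xA1 <= b <= 0xF9


def is_trail(b: int) -> bool:
    """Trail byte: 0x40-0x7E or 0xA1-0xFE."""
    return 0x40 <= b <= 0x7E or 0xA1 <= b <= 0xFE


def pack(lead: int, trail: int) -> int:
    """Combine two bytes into the usual 4-digit code (big-endian order)."""
    if not (is_lead(lead) and is_trail(trail)):
        raise ValueError(f"not a standard Big5 byte pair: {lead:#04x} {trail:#04x}")
    return (lead << 8) | trail


def unpack(code: int) -> bytes:
    """Split a 4-digit code back into its lead and trail bytes."""
    return code.to_bytes(2, "big")


if __name__ == "__main__":
    code = pack(0xA1, 0x40)           # the full-width space from the text
    print(f"{code:04X}")              # A140
    print(unpack(0xA140).hex(" "))    # a1 40
```

Treating the pair as a big-endian 16-bit number is purely a notational convenience; the bytes are still stored and transmitted in lead-then-trail order.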
Bytes in the range {{mono|0x00}} to {{mono|0x7f}} that are not part of a double-byte character are assumed to be single-byte characters. (For a more detailed description of this problem, see the discussion on "The Matching SBCS" below.)

The meaning of non-ASCII single bytes outside the permitted values that are not part of a double-byte character varies from system to system. In old MS-DOS-based systems, they are likely to be displayed as 8-bit characters; in modern systems, they are likely to either give unpredictable results or generate an error.

===A more detailed look at the organization===
In the original Big5, the encoding is compartmentalized into different zones:

{| class="wikitable"
|-
| {{mono|0x8140}} to {{mono|0xA0FE}}|| Reserved for user-defined characters 造字
|-
| {{mono|0xA140}} to {{mono|0xA3BF}}|| "Graphical characters" 圖形碼
|-
| {{mono|0xA3C0}} to {{mono|0xA3FE}}|| Reserved, ''not'' for user-defined characters
|-
| {{mono|0xA440}} to {{mono|0xC67E}}|| Frequently used characters 常用字
|-
| {{mono|0xC6A1}} to {{mono|0xC8FE}}|| Reserved for user-defined characters
|-
| {{mono|0xC940}} to {{mono|0xF9D5}}|| Less frequently used characters 次常用字
|-
| {{mono|0xF9D6}} to {{mono|0xFEFE}}|| Reserved for user-defined characters
|}

The "graphical characters" actually comprise punctuation marks, partial punctuation marks (e.g., half of a dash, half of an ellipsis; see below), [[dingbat]]s, foreign characters, and other special characters (e.g., presentational "full width" forms, digits for [[Suzhou numerals]], [[bopomofo|zhuyin fuhao]], etc.).

In most vendor extensions, extended characters are placed in the various zones reserved for user-defined characters, each of which is normally regarded as associated with the preceding zone.
For example, additional "graphical characters" (e.g., punctuation marks) would be expected to be placed in the {{mono|0xa3c0}}–{{mono|0xa3fe}} range, and additional logograms would be placed in either the {{mono|0xc6a1}}–{{mono|0xc8fe}} or the {{mono|0xf9d6}}–{{mono|0xfefe}} range. Sometimes, this is not possible due to the large number of extended characters to be added; for example, [[Cyrillic]] letters and Japanese [[kana]] have been placed in the zone associated with "frequently-used characters".

===Duplicates===
Big5 encodes two characters twice: "兀" at 0xA461 (U+5140) and 0xC94A (U+FA0C), and "嗀" at 0xDCD1 (U+55C0) and 0xDDFC (U+FA0D). Some encoding mappings also map the three Suzhou numerals in the graphical section, "〸", "〹" and "〺", to ideograph characters (U+5341, U+5344 and U+5345 respectively)<ref>{{Cite web|title=Unicode CP950 mapping file|url=https://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP950.TXT|website=Unicode|publisher=[[Unicode Consortium]]|access-date=2023-05-11|archive-date=2023-06-27|archive-url=https://web.archive.org/web/20230627235611/https://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP950.TXT|url-status=live}}</ref><ref>{{Cite web|title=Unicode Big5 mapping file|url=https://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT|website=Unicode|publisher=[[Unicode Consortium]]|access-date=2023-05-11|archive-date=2023-06-27|archive-url=https://web.archive.org/web/20230627235404/https://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT|url-status=live}}</ref> instead of [[CJK Symbols and Punctuation]] (U+3038, U+3039 and U+303A respectively).<ref>{{Cite web|title=Mozilla 系列與 Big5 中文字碼(Big5-2003)|url=https://moztw.org/docs/big5/table/big5_2003-b2u.txt|website=Mozilla 台湾社群|lang=zh-TW|access-date=2020-07-01|archive-date=2023-06-27|archive-url=https://web.archive.org/web/20230627234452/https://moztw.org/docs/big5/table/big5_2003-b2u.txt|url-status=live}}</ref><ref>The ETEN mapping file provided by 
Mozilla Taiwan community maps the three characters to both the symbol and ideograph code points. {{Cite web|title=Mozilla 系列與 Big5 中文字碼(ETEN)|url=https://moztw.org/docs/big5/table/eten.txt|website=Mozilla 台湾社群|lang=zh-TW|access-date=2020-07-01|archive-date=2023-06-27|archive-url=https://web.archive.org/web/20230627234353/https://moztw.org/docs/big5/table/eten.txt|url-status=live}}</ref>

===What a Big5 code actually encodes===
An individual Big5 code does not always represent a complete semantic unit. The Big5 codes of logograms are always logograms, but codes in the "graphical characters" section are not always complete "graphical characters". What Big5 encodes are particular graphical representations of characters or parts of characters that happen to fit in the space taken by two monospaced ASCII characters. This is a property of [[CJK characters|CJK]] double-byte character sets, and is not a problem unique to Big5.

(This needs some historical context, as it is ''theoretically'' incorrect. When text-mode personal computing was still the norm, characters were normally represented as single bytes, and each character took one position on the screen. There was therefore a practical reason to insist that double-byte characters take up two positions on the screen: off-the-shelf, American-made software would then be usable without modification on a DBCS-based system. If a character could take an arbitrary number of screen positions, software that assumes that one ''byte'' of text takes one screen position would produce incorrect output. A computer that never had to deal with a text screen had no need for this artificial restriction; the Apple Macintosh is an example. Nevertheless, the encoding itself had to be designed so that it works correctly on text-screen-based systems.)

To illustrate this point, consider the Big5 code {{mono|0xa14b}} (…).
To English speakers this looks like an ellipsis, and the Unicode standard identifies it as such. In Chinese, however, the ellipsis consists of six dots that fit in the space of two Chinese characters (……). There is therefore no Big5 code for the Chinese ellipsis, and the Big5 code {{mono|0xa14b}} represents only half of one: the whole ellipsis should take the space of two Chinese characters, and in many DBCS systems one DBCS character must take exactly the space of one Chinese character.

Characters encoded in Big5 do not always represent things that can be readily used in plain text files. One example is the "citation mark" ({{mono|0xa1ca}}, ﹋), which, when used, must be typeset under the title of a literary work. Another example is the Suzhou numerals, which are a form of [[scientific notation]] that requires the number to be laid out in a 2-D form consisting of at least two rows.

===The Matching SBCS===
In practice, Big5 cannot be used without a matching SBCS, mostly for compatibility reasons. However, as in the case of other CJK DBCS character sets, the SBCS to use has never been specified. Big5 has always been defined as a DBCS, though when used it must be paired with a suitable, ''unspecified'' SBCS and is therefore used as what some people call an [[Variable-width encoding|MBCS]]; nevertheless, Big5 by itself, as defined, is strictly a DBCS.

Because the SBCS is unspecified, it can theoretically vary from system to system. Nowadays, ASCII is in practice the only SBCS one would use. However, in old [[MS-DOS|DOS]]-based systems, [[code page 437]] (with its extra special symbols in the control code area, including position 127) was much more common. Yet, on a Macintosh system with the Chinese Language Kit, or on a Unix system running the cxterm terminal emulator, the SBCS paired with Big5 would not be code page 437.
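One way to picture the pairing of Big5 with an unspecified SBCS is a scanner that splits a byte string into one- and two-byte units and leaves the interpretation of the single bytes to a codec chosen by the caller. The sketch below is an illustration under stated assumptions, not a reference decoder: <code>split_units</code> and <code>decode_mixed</code> are hypothetical names, the lead-byte range 0xA1 to 0xF9 is the standard one given earlier, and Python's built-in <code>big5</code>, <code>ascii</code> and <code>cp437</code> codecs stand in for whatever tables a real system used.

```python
from typing import Iterator


def split_units(data: bytes) -> Iterator[bytes]:
    """Yield 1-byte (SBCS) and 2-byte (Big5) units from a mixed byte string."""
    i = 0
    while i < len(data):
        b = data[i]
        # A byte in the lead range followed by a valid trail byte forms a DBCS unit.
        if 0xA1 <= b <= 0xF9 and i + 1 < len(data):
            t = data[i + 1]
            if 0x40 <= t <= 0x7E or 0xA1 <= t <= 0xFE:
                yield data[i:i + 2]
                i += 2
                continue
        # Anything else is handed to the SBCS as a single byte.
        yield data[i:i + 1]
        i += 1


def decode_mixed(data: bytes, sbcs: str = "ascii") -> str:
    """Decode Big5 pairs with Python's 'big5' codec and single bytes with `sbcs`."""
    return "".join(
        unit.decode("big5" if len(unit) == 2 else sbcs, errors="replace")
        for unit in split_units(data)
    )


if __name__ == "__main__":
    sample = b"A\xa4\x40\x80"   # 'A', one Big5 pair, then a byte outside ASCII
    print([u.hex() for u in split_units(sample)])
    print(repr(decode_mixed(sample, sbcs="ascii")))
    print(repr(decode_mixed(sample, sbcs="cp437")))
```

Passing the SBCS as a parameter mirrors the point made above: the same trailing byte 0x80 is undecodable under ASCII but is an ordinary character under code page 437, while the Big5 pair is unaffected by the choice.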
Outside the valid range of Big5, the old DOS-based systems would routinely interpret bytes according to the SBCS that is paired with Big5 on that system. In such systems, characters 127 to 160, for example, were very likely not avoided because they would produce invalid Big5, but used because they would be valid characters in code page 437.

The modern characterization of Big5 as an MBCS consisting of the DBCS of Big5 plus the SBCS of ASCII is therefore historically incorrect and potentially flawed, as the choice of the matching SBCS was, and theoretically still is, quite independent of the flavour of Big5 being used.

==History==
The inability of ASCII to support large [[CJK characters|Chinese, Japanese and Korean (CJK)]] character sets led governments and industry to find creative solutions to enable their languages to be rendered on computers. A variety of ad hoc and usually proprietary input methods led to efforts to develop a standard system. As a result, Big5 encoding was defined by the [[Institute for Information Industry]] of Taiwan in 1984. The name "Big5" recognizes that the standard emerged from a collaboration of five of Taiwan's largest IT firms:
* [[Acer Inc.|Acer]] ([[:zh:宏碁|宏碁]])
* [[MiTAC]] (神通)
* JiaJia (佳佳)
* ZERO ONE Technology (零壹 or [http://www.01tech.com/ 01tech])
* [[First International Computer|First International Computer (FIC)]] (大眾)

Big5 was rapidly popularized in Taiwan and worldwide among Chinese who used the traditional Chinese character set through its adoption in several commercial software packages, notably the [[E-TEN]] Chinese [[DOS]] input system ([[ETen Chinese System]]). The [[Republic of China]] government declared '''Big5''' its standard in the mid-1980s since it was, by then, the ''de facto'' standard for using traditional Chinese on computers.
==Extensions==
The original Big-5 includes only CJK logograms from the [[Chart of Standard Forms of Common National Characters|Charts of Standard Forms]] of Common National Characters (4808 characters) and Less-Than-Common National Characters (6343 characters), but not characters used in people's names, place names, dialects, [[chemistry]], or [[biology]], nor Japanese kana. As a result, many Big-5 supporting programs include extensions to address these gaps.

The plethora of variations makes [[UTF-8]] (or [[UTF-16]] or the Chinese [[GB 18030]] standard, which is also a full Unicode Transformation Format, i.e. not only for simplified Chinese) a more consistent encoding for modern use.

===Vendor extensions===
====ETen extensions====
In the [[ETen Chinese System|ETen]] (倚天) Chinese operating system, the following code points are added to support some characters present in the [[IBM 5550]]'s code page but absent from generic Big5:
* <code>0xA3C0</code>–<code>0xA3E0</code>: 33 control characters.
* <code>0xC6A1</code>–<code>0xC875</code>: circle 1–10, bracket 1–10, [[Roman numerals]] 1–9 (i–ix), CJK radical glyphs, Japanese [[hiragana]], Japanese [[katakana]], [[Cyrillic]] characters
* <code>0xF9D6</code>–<code>0xF9FE</code>: the characters '[[wikt:碁|碁]]', '[[wikt:銹|銹]]', '[[wikt:恒|恒]]', '[[wikt:裏|裏]]', '[[wikt:墻|墻]]', '[[wikt:粧|粧]]' and '[[wikt:嫺|嫺]]', followed by 34 additional [[semigraphic]] symbols.

In some versions of ETen, there are extra graphical symbols and [[simplified Chinese characters]].

====Microsoft code pages====
{{Main|Code page 950}}
[[Microsoft]] (微軟) created its own Big5 extension, [[code page 950]], for use with [[Microsoft Windows]]; it supports the F9D6–F9FE code points from ETEN's extensions. In some versions of Windows, the [[euro]] [[Euro sign|currency symbol]] is mapped to Big-5 code point A3E1.
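Because each vendor extension fills the reserved zones differently, the same two bytes may decode under one codec and fail under another. The following sketch probes this with Python's standard <code>big5</code> and <code>cp950</code> codecs. The helper name <code>try_decode</code> is hypothetical, and the results for the extension code points depend on the mapping tables shipped with the Python build, so no output is claimed for them here.

```python
from typing import Optional


def try_decode(code: int, codec: str) -> Optional[str]:
    """Decode a 4-digit Big5 code with the given codec, or return None if unmapped."""
    try:
        return code.to_bytes(2, "big").decode(codec)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # 0xA440 lies in the frequently used zone; 0xF9D6 and 0xA3E1 are the
    # ETen-derived and euro-sign code points mentioned in the text above.
    for code in (0xA440, 0xF9D6, 0xA3E1):
        for codec in ("big5", "cp950"):
            print(f"{code:04X} {codec:>5}: {try_decode(code, codec)!r}")
```

Returning <code>None</code> rather than raising keeps the comparison loop simple; a converter meant for real data would more likely report the offending byte offset.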
After installing Microsoft's [http://www.microsoft.com/hk/hkscs/default.aspx HKSCS patch] on top of traditional Chinese Windows (or any version of Windows 2000 and above with proper language pack), applications using code page 950 automatically use a hidden code page 951 table. The table supports all code points in HKSCS-2001, except for the compatibility code points specified by the standard.<ref>{{Cite web|url=http://me.abelcheung.org/2006/09/12/what-is-cp951/|title=狗爺語錄 » Blog Archive » What is Code Page 951 (CP951)?<!-- Bot generated title -->|access-date=2006-09-27|archive-url=https://web.archive.org/web/20070222070938/http://me.abelcheung.org/2006/09/12/what-is-cp951/|archive-date=2007-02-22|url-status=dead}}</ref> ====IBM code pages==== In contrast to Microsoft's code page 950, IBM's [[CCSID]] 950 comprises single byte code page 1114 (CCSID 1114) and double byte code page 947 (CCSID 947).<ref name="ccsid950">{{cite web|title=CCSID 950 information document|archive-url=https://web.archive.org/web/20141202001630/http://www-01.ibm.com/software/globalization/ccsid/ccsid950.html|archive-date=2014-12-02|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid950.html}}</ref><ref>{{cite web|title=CCSID 1114 information document|archive-url=https://web.archive.org/web/20160327100728/http://www-01.ibm.com/software/globalization/ccsid/ccsid1114.html|archive-date=2016-03-27|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid1114.html}}</ref><ref>{{cite web|title=CCSID 947 information document|archive-url=https://web.archive.org/web/20141201232116/http://www-01.ibm.com/software/globalization/ccsid/ccsid947.html|archive-date=2014-12-01|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid947.html}}</ref> It incorporates ETEN extensions for lead bytes {{mono|0xA3}},<ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=A3&s=ALL#layout|title=Lead byte A3: ibm-950_P110-1999|work=ICU Demonstration - Converter 
Explorer|publisher=[[International Components for Unicode]]}}</ref> {{mono|0xC6}},<ref name="rfc1922">{{citation|mode=cs1|id=<nowiki>RFC 1922</nowiki>|title=Chinese Character Encoding for Internet Messages|url=https://tools.ietf.org/html/rfc1922|first1=HF.|last1=Zhu|first2=DY.|last2=Hu|first3=ZG.|last3=Wang|first4=TC.|last4=Kao|first5=WCH.|last5=Chang|first6=M.|last6=Crispin|publisher=[[IETF]]|work=Requests for Comments|date=1996|doi=10.17487/rfc1922|doi-access=|access-date=2022-01-01|archive-date=2021-04-29|archive-url=https://web.archive.org/web/20210429165037/https://tools.ietf.org/html/rfc1922|url-status=live|url-access=subscription}}</ref><ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=C6&s=ALL#layout|title=Lead byte C6: ibm-950_P110-1999|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]}}</ref> {{mono|0xC7}}<ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=C7&s=ALL#layout|title=Lead byte C7: ibm-950_P110-1999|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]}}</ref> and {{mono|0xC8}},<ref name="rfc1922"/><ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=C8&s=ALL#layout|title=Lead byte C8: ibm-950_P110-1999|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]}}</ref> while omitting those with lead byte {{mono|0xF9}} (which Microsoft includes), mapping them instead to the [[Private Use Area]] as user-defined characters.<ref name="rfc1922"/><ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=F9&s=ALL#layout|title=Lead byte F9: ibm-950_P110-1999|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]}}</ref> It also includes two non-ETEN extension regions with trail bytes {{mono|0x81–A0}}, i.e. 
outside the usual Big5 trail byte range but similar to the Big5+ trail byte range: area 5 has lead bytes {{mono|0xF2–F9}} and contains IBM-selected characters, while area 9 has lead bytes {{mono|0x81–8C}} and is a user-defined region.<ref>{{cite web|url=https://public.dhe.ibm.com/as400/products/clientaccess/win32/files/globalization/T_Chinese_big51999.pdf|title=IBM Traditional Chinese Graphic Character Set for IBM BIG-5 Code|publisher=[[IBM]]|year=1999|id=C-H 3-3220-131 1999-04|access-date=2022-01-01|archive-date=2021-11-22|archive-url=https://web.archive.org/web/20211122211354/https://public.dhe.ibm.com/as400/products/clientaccess/win32/files/globalization/T_Chinese_big51999.pdf|url-status=live}}</ref> IBM refers to the euro sign update of their Big-5 variant as CCSID 1370, which includes both single-byte ({{mono|0x80}}) and double-byte ({{mono|0xA3E1}}) euro signs.<ref name="ccsid1370">{{cite web|title=CCSID 1370 information document|archive-url=https://web.archive.org/web/20160327104212/http://www-01.ibm.com/software/globalization/ccsid/ccsid1370.html|archive-date=2016-03-27|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid1370.html}}</ref> It comprises single byte code page 1114 (CCSID 5210) and double byte code page 947 (CCSID 21427).<ref name="ccsid1370"/><ref>{{cite web|title=CCSID 5210 information document|archive-url=https://web.archive.org/web/20141129231704/http://www-01.ibm.com/software/globalization/ccsid/ccsid5210.html|archive-date=2014-11-29|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid5210.html}}</ref><ref>{{cite web|title=CCSID 21427 information document|archive-url=https://web.archive.org/web/20160327035914/http://www-01.ibm.com/software/globalization/ccsid/ccsid21427.html|archive-date=2016-03-27|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid21427.html}}</ref> For better compatibility with Microsoft's variant in [[IBM Db2]], IBM also define the pure double-byte code page 1372<ref>{{cite 
web|archive-url=https://web.archive.org/web/20160317015819/http://www-01.ibm.com/software/globalization/cp/cp01372.html|archive-date=2016-03-17|url=http://www-01.ibm.com/software/globalization/cp/cp01372.html|url-status=dead|title=CPGID 01372: MS T-Chinese Big-5 (Special for DB2)|work=IBM Globalization - Code page identifiers}}</ref> and the associated variable-width CCSID 1373, which corresponds to Microsoft's code page 950.<ref>{{cite web|url=http://icu-project.org/icu-bin/convexp?conv=ibm-1373_P100-2002|title=ibm-1373_P100-2002|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]|access-date=2022-01-01|archive-date=2021-05-26|archive-url=https://web.archive.org/web/20210526173749/https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-1373_P100-2002|url-status=live}}</ref> IBM assigns CCSID 5471 to the HKSCS-2001 Big5 code page (with CPGID 1374 as CCSID 5470 as the double byte component),<ref>{{cite web|archive-url=https://web.archive.org/web/20141129233053/http://www-01.ibm.com/software/globalization/ccsid/ccsid5471.html|archive-date=2014-11-29|url-status=dead|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid5471.html|work=IBM Globalization - Coded character set identifiers|publisher=[[IBM]]|title=CCSID 5471: Mixed Big-5 ext for HKSCS-2001}}</ref><ref>{{Citation|title=International Components for Unicode (ICU), ibm-5471_P100-2006.ucm|url=https://github.com/unicode-org/icu/blob/master/icu4c/source/data/mappings/ibm-5471_P100-2006.ucm|date=2007-05-09|access-date=2022-01-01|archive-date=2023-08-13|archive-url=https://web.archive.org/web/20230813202101/https://github.com/unicode-org/icu/blob/main/icu4c/source/data/mappings/ibm-5471_P100-2006.ucm|url-status=live}}</ref> CCSID 9567 to the HKSCS-2004 code page (with CPGID 1374 as CCSID 9566 as the double byte component),<ref>{{cite 
web|archive-url=https://web.archive.org/web/20141129212819/http://www-01.ibm.com/software/globalization/ccsid/ccsid9567.html|archive-date=2014-11-29|url-status=dead|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid9567.html|work=IBM Globalization - Coded character set identifiers|publisher=[[IBM]]|title=CCSID 9567: Mixed Big-5 ext for HKSCS-2004}}</ref> and CCSID 13663 to the HKSCS-2008 code page (with CPGID 1374 as CCSID 13662 as the double byte component),<ref>{{cite web|archive-url=https://web.archive.org/web/20141129213320/http://www-01.ibm.com/software/globalization/ccsid/ccsid13663.html|archive-date=2014-11-29|url-status=dead|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid13663.html|work=IBM Globalization - Coded character set identifiers|publisher=[[IBM]]|title=CCSID 13663: Mixed Big-5 ext for HKSCS-2008}}</ref> while CCSID 1375 is assigned to a growing HKSCS code page, currently equivalent to CCSID 13663.<ref>{{cite web|archive-url=https://web.archive.org/web/20141129231410/http://www-01.ibm.com/software/globalization/ccsid/ccsid1375.html|archive-date=2014-11-29|url-status=dead|url=http://www-01.ibm.com/software/globalization/ccsid/ccsid1375.html|work=IBM Globalization - Coded character set identifiers|publisher=[[IBM]]|title=CCSID 1375: Mixed Big-5 ext for HKSCS}}</ref> ====ChinaSea font==== [[ChinaSea]] fonts (中國海字集)<ref>{{cite web|author=黃國書|url=http://ftp.isu.edu.tw/pub/Windows/Chinese/font/chinasea/csw10_exp.txt|title=Chinasea 1.0 中國海字集|publisher=ISU FTP|access-date=2016-12-05|archive-url=https://web.archive.org/web/20050319032136/http://ftp.isu.edu.tw/pub/Windows/Chinese/font/chinasea/csw10_exp.txt|archive-date=2005-03-19|url-status=dead}}</ref> are Traditional Chinese fonts made by ChinaSea. The fonts are rarely sold separately, but are bundled with other products, such as the Chinese version of [[Microsoft Office 97]]. The fonts support Japanese kana, [[kokuji]], and other characters missing in Big-5. 
As a result, the ChinaSea extensions have become more popular than the government-supported extensions.{{as of?|date=May 2019}} Some Hong Kong [[Bulletin board system|BBSes]] used encodings in ChinaSea fonts before the introduction of HKSCS.

===='Sakura' font====
The [https://web.archive.org/web/20060904023817/http://input.foruto.com/jptxt/ 'Sakura' font] (日和字集 Sakura Version) was developed in Hong Kong and is designed to be compatible with HKSCS. It adds support for [[kokuji]] and proprietary dingbats (including [[Doraemon]]) not found in HKSCS.

====Unicode-at-on====
Unicode-at-on ([[:zh:Unicode補完計畫|Unicode補完計畫]]), formerly BIG5 extension, extends BIG-5 by altering code page tables, but uses the ChinaSea extensions starting with version 2. However, with the bankruptcy of ChinaSea, late development, and the increasing popularity of HKSCS and [[Unicode]] (the project is not compatible with HKSCS), the success of this extension is limited at best. Despite the problems, characters previously mapped to the Unicode Private Use Area are remapped to the standardized equivalents when exporting characters to Unicode format.

====OPG====
The web sites of the [[Oriental Daily News]] and [[Sun Daily]], belonging to the [[Oriental Press Group Limited]] (東方報業集團有限公司) in Hong Kong, used a downloadable font with a Big-5 extension encoding different from HKSCS.

===Official extensions===
====Taiwan Ministry of Education font====
The Taiwan Ministry of Education supplied its own font, the Taiwan Ministry of Education font (臺灣教育部造字檔), for internal use.

====Taiwan Council of Agriculture font====
The Executive Yuan introduced a 133-character custom font, the Taiwan Council of Agriculture font (臺灣農委會常用中文外字集), that includes 84 characters from the [[Radical 195|fish radical]] and 7 from the [[Radical 196|bird radical]].
====Big5+====
The [[Chinese Foundation for Digitization Technology]] (中文數位化技術推廣委員會) introduced Big5+ in 1997, which used over 20000 code points to incorporate all CJK logograms in Unicode 1.1. However, the extra code points exceeded the original Big-5 definition (Big5+ uses high byte values 81-FE and low byte values 40-7E and 80-FE), preventing it from being installed on Microsoft Windows without new codepage files.

====Big-5E====
To allow Windows users to use custom fonts, the Chinese Foundation for Digitization Technology introduced Big-5E, which added 3954 characters (in three blocks of code points: 8E40-A0FE, 8140-86DF, 86E0-875C) and removed the Japanese kana from the ETEN extension. Unlike Big5+, Big-5E extends Big-5 within its original definition. [[Mac OS X 10.3]] and later support Big-5E in the fonts LiHei Pro (儷黑 Pro.ttf) and LiSong Pro (儷宋 Pro.ttf).

====Big5-2003====
The Chinese Foundation for Digitization Technology made a Big5 definition and put it into [[CNS 11643]] in note form, making it part of the official standard in Taiwan. Big5-2003 incorporates all Big-5 characters introduced in the 1984 ETEN extensions (code points A3C0-A3E0, C6A1-C7F2, and F9D6-F9FE) and the euro sign. Cyrillic characters were not included because the authority claimed CNS 11643 does not include such characters.

====CDP====
[[Academia Sinica]] made a Chinese Data Processing font (漢字構形資料庫) in the late 1990s; its latest release, version 2.5, included 112,533 characters, somewhat fewer than the [[Mojikyo]] fonts.

====HKSCS====
{{Main|Hong Kong Supplementary Character Set}}
[[Hong Kong]] also adopted Big5 for character encoding. However, [[written Cantonese]] has its own characters not available in the normal Big5 character set. To solve this problem, the [[Hong Kong Government]] created the Big5 extensions [[Government Chinese Character Set]] (GCCS) in 1995 and [[Hong Kong Supplementary Character Set]] in 1999. The Hong Kong extensions were commonly distributed as a patch.
They are still distributed as a patch by Microsoft, but a full Unicode font is also available from the Hong Kong Government's web site.

There are two encoding schemes of HKSCS: one for the Big-5 coding standard and the other for the [[ISO 10646]] standard. The initial release was followed by HKSCS-2001 and HKSCS-2004. HKSCS-2004 is technically aligned with ISO/IEC 10646:2003 and its Amendment 1, published in April 2004 by the International Organization for Standardization (ISO).

HKSCS includes all the characters from the common ETen extension, plus some characters from simplified Chinese, place names, people's names, and Cantonese phrases (including [[Cantonese profanity|profanity]]).

{{as of|2020}}, the most recent edition of HKSCS is HKSCS-2016; however, the last edition of HKSCS to encode all of its characters in Big5 was HKSCS-2008, while the characters added in more recent editions are mapped to ISO 10646 / [[Unicode]] only (as a [[CJK Unified Ideographs]] horizontal glyph extension where appropriate).<ref name="irgn2430">{{cite web|url=https://appsrv.cse.cuhk.edu.hk/~irg/irg/irg53/IRGN2430.pdf|title=Submission of Macao's Vertical Extension (UNC Characters), Horizontal Extension, and IVSes Registration for MSCS|author=Macao Special Administrative Region Government|date=2020-06-11|id=[[ISO/IEC JTC 1/SC 2]]/WG 2 [[Ideographic Research Group|IRGN]] 2430|access-date=2020-07-02|archive-date=2020-06-23|archive-url=https://web.archive.org/web/20200623040740/https://appsrv.cse.cuhk.edu.hk/~irg/irg/irg53/IRGN2430.pdf|url-status=live}}</ref> Similarly, Macao needs characters that are included in neither Big5 nor HKSCS; the ''Macao Supplementary Character Set'' (MSCS) was developed to cover them. It, too, is not encoded in Big5.
The first batch of 121 MSCS characters was submitted for inclusion in or mapping to Unicode in 2009,<ref name="irgn1580">{{cite web|url=http://appsrv.cse.cuhk.edu.hk/~irg/irg/irg32/IRGN1580MacaoCharsFromMISCS.pdf|title=Submission of Characters from Macao Information Systems Character Set|author=Computer Chinese Characters Encoding Workgroup|date=2009-06-12|id=[[ISO/IEC JTC 1/SC 2]]/WG 2 [[Ideographic Research Group|IRGN]] 1580|archive-url=https://web.archive.org/web/20150104014324/http://appsrv.cse.cuhk.edu.hk/~irg/irg/irg32/IRGN1580MacaoCharsFromMISCS.pdf|archive-date=2015-01-04|url-status=dead}}</ref> and the first final version of MSCS was established in 2020.<ref name="irgn2430"/>

==Kana and Cyrillic==
There are two major Big5 extension layouts for encoding kana, [[Russian alphabet|Russian Cyrillic]] and list markers in the range 0xC6A1 through 0xC875. These are not compatible with one another.<ref>{{citation|mode=cs1|url=http://users.monash.edu/~jwb/cjk.inf|section=2.3.1: BIG FIVE|title=CJK.INF Version 2.1|last=Lunde|first=Ken|author-link=Ken Lunde|date=1996-07-12|access-date=2020-03-15|archive-date=2021-05-15|archive-url=https://web.archive.org/web/20210515173205/https://users.monash.edu/~jwb/cjk.inf|url-status=live}}</ref> They are compared in the table below.
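As a rough illustration of the range 0xC6A1 through 0xC875, the following Python sketch enumerates its two-byte codes and decodes a few of them with two codecs from Python's standard library. It assumes the usual Big5 trail-byte ranges (0x40-0x7E and 0xA1-0xFE), which match the sequence of codes in the comparison table. The helper names <code>big5_range</code>, <code>decode_with</code> and <code>code_points</code> are hypothetical, and the decoded results may differ between Python versions, so no output is asserted here.

```python
def big5_range(start: int, end: int):
    """Yield two-byte Big5 codes from start to end inclusive.

    Assumption: valid trail bytes are 0x40-0x7E and 0xA1-0xFE.
    """
    for code in range(start, end + 1):
        trail = code & 0xFF
        if 0x40 <= trail <= 0x7E or 0xA1 <= trail <= 0xFE:
            yield code


def decode_with(code: int, codec: str) -> str:
    """Decode one two-byte code; unmapped bytes become U+FFFD."""
    return code.to_bytes(2, "big").decode(codec, errors="replace")


def code_points(text: str) -> str:
    """Format a string as ASCII-only U+XXXX notation for printing."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# All codes in the range covered by the comparison table.
codes = list(big5_range(0xC6A1, 0xC875))
print(len(codes), "codes in range")

# Compare how two standard-library codecs treat a few sample codes.
for code in (0xC6A1, 0xC6E7, 0xC7B1):
    print(
        f"0x{code:04X}",
        "cp950:", code_points(decode_with(code, "cp950")),
        "big5hkscs:", code_points(decode_with(code, "big5hkscs")),
    )
```

Printing code points rather than the characters themselves keeps the output readable on terminals that lack the relevant fonts.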
The ETEN layout of kana and Cyrillic is also used by the HKSCS<ref>{{cite web|url=https://moztw.org/docs/big5/table/hkscs2004.txt|title=Big5HKSCS-2004|publisher=Mozilla Taiwan|access-date=2020-07-01|archive-date=2020-09-24|archive-url=https://web.archive.org/web/20200924084810/http://moztw.org/docs/big5/table/hkscs2004.txt|url-status=live}}</ref> (including [[HTML5]])<ref>{{cite web|url=https://encoding.spec.whatwg.org/big5.html|title=big5|work=Encoding Standard|last=van Kesteren|first=Anne|author-link=Anne van Kesteren|publisher=[[WHATWG]]|access-date=2020-03-15|archive-date=2020-05-04|archive-url=https://web.archive.org/web/20200504184159/https://encoding.spec.whatwg.org/big5.html|url-status=live}}</ref> and Unicode-At-On<ref>{{cite web|url=https://moztw.org/docs/big5/table/uao241-b2u.txt|title=UAO 2.41 b2u|publisher=Mozilla Taiwan|access-date=2020-07-01|archive-date=2020-10-24|archive-url=https://web.archive.org/web/20201024032150/http://moztw.org/docs/big5/table/uao241-b2u.txt|url-status=live}}</ref> variants, as well as by IBM's version of code page 950,<ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=C6&s=ALL#layout|title=Lead byte C6: ibm-950_P110-1999|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]}}</ref><ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=C7&s=ALL#layout|title=Lead byte C7: ibm-950_P110-1999|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]}}</ref><ref>{{cite web|url=https://demo.icu-project.org/icu-bin/convexp?conv=ibm-950_P110-1999&b=C8&s=ALL#layout|title=Lead byte C8: ibm-950_P110-1999|work=ICU Demonstration - Converter Explorer|publisher=[[International Components for Unicode]]}}</ref> and the ETEN layout of the kana (with Cyrillic omitted) is also used by the Big5-2003 variant.<ref>{{cite web|url=https://moztw.org/docs/big5/table/big5_2003-b2u.txt|title=Big5-2003 
b2u|publisher=Mozilla Taiwan|access-date=2020-07-01|archive-date=2023-06-27|archive-url=https://web.archive.org/web/20230627234452/https://moztw.org/docs/big5/table/big5_2003-b2u.txt|url-status=live}}</ref> The published mapping files for [[Windows-950]] include neither, and this Big5 range is mapped to the [[Private Use Area]] by the Windows-950 implementation from [[International Components for Unicode]].<ref>{{cite web|url=https://opensource.apple.com/source/ICU/ICU-59180.0.1/icuSources/data/mappings/windows-950-2000.ucm|date=2002-12-03|author1=IBM|author-link1=IBM|author2=Unicode Consortium|author-link2=Unicode Consortium|title=windows-950-2000|work=[[International Components for Unicode]]|access-date=2020-07-01|archive-date=2020-07-02|archive-url=https://web.archive.org/web/20200702020605/https://opensource.apple.com/source/ICU/ICU-59180.0.1/icuSources/data/mappings/windows-950-2000.ucm|url-status=live}}</ref> [[Python (programming language)|Python]]'s built-in {{code|cp950}} codec uses the BIG5.TXT layout.<ref>{{Cite web|url=https://onlinegdb.com/pM_GMuuab|title=Script showing output of cp950 codec for lead bytes 0xC6 and 0xC7|access-date=2022-10-18|archive-date=2022-10-18|archive-url=https://web.archive.org/web/20221018142645/https://www.onlinegdb.com/pM_GMuuab|url-status=live}}</ref> The [[classic Mac OS]] version includes neither layout.<ref name="mactradchinese"/> {|class="wikitable collapsible collapsed" style="border:none" |- !scope="col"| Big5 codes 0xC6A1 through 0xC875 |- |style="padding:0;border:none"| {|class="wikitable" |- !Big5 code!!BIG5.TXT layout<ref>{{citation|mode=cs1|url=https://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT|title=BIG5 to Unicode table (complete)|author=Unicode Consortium|author-link=Unicode
Consortium|date=2015-12-02|orig-year=1994-02-11|access-date=2020-03-15|archive-date=2023-06-27|archive-url=https://web.archive.org/web/20230627235404/https://unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/OTHER/BIG5.TXT|url-status=live}}</ref>!!ETEN layout<ref>{{cite web|url=https://moztw.org/docs/big5/table/eten.txt|date=2002-02-24|title=Big5-ETen vs Unicode mapping table|publisher=Mozilla Taiwan|access-date=2020-07-01|archive-date=2023-06-27|archive-url=https://web.archive.org/web/20230627234353/https://moztw.org/docs/big5/table/eten.txt|url-status=live}}</ref> |- |0xC6A1||ヾ||① |- |0xC6A2||ゝ||② |- |0xC6A3||ゞ||③ |- |0xC6A4||々||④ |- |0xC6A5||ぁ||⑤ |- |0xC6A6||あ||⑥ |- |0xC6A7||ぃ||⑦ |- |0xC6A8||い||⑧ |- |0xC6A9||ぅ||⑨ |- |0xC6AA||う||⑩ |- |0xC6AB||ぇ||⑴ |- |0xC6AC||え||⑵ |- |0xC6AD||ぉ||⑶ |- |0xC6AE||お||⑷ |- |0xC6AF||か||⑸ |- |0xC6B0||が||⑹ |- |0xC6B1||き||⑺ |- |0xC6B2||ぎ||⑻ |- |0xC6B3||く||⑼ |- |0xC6B4||ぐ||⑽ |- |0xC6B5||け||ⅰ |- |0xC6B6||げ||ⅱ |- |0xC6B7||こ||ⅲ |- |0xC6B8||ご||ⅳ |- |0xC6B9||さ||ⅴ |- |0xC6BA||ざ||ⅵ |- |0xC6BB||し||ⅶ |- |0xC6BC||じ||ⅷ |- |0xC6BD||す||ⅸ |- |0xC6BE||ず||ⅹ |- |0xC6BF||せ||丶 |- |0xC6C0||ぜ||丿 |- |0xC6C1||そ||亅 |- |0xC6C2||ぞ||亠 |- |0xC6C3||た||冂 |- |0xC6C4||だ||冖 |- |0xC6C5||ち||冫 |- |0xC6C6||ぢ||勹 |- |0xC6C7||っ||匸 |- |0xC6C8||つ||卩 |- |0xC6C9||づ||厶 |- |0xC6CA||て||夊 |- |0xC6CB||で||宀 |- |0xC6CC||と||巛 |- |0xC6CD||ど||⼳ |- |0xC6CE||な||广 |- |0xC6CF||に||廴 |- |0xC6D0||ぬ||彐 |- |0xC6D1||ね||彡 |- |0xC6D2||の||攴 |- |0xC6D3||は||无 |- |0xC6D4||ば||疒 |- |0xC6D5||ぱ||癶 |- |0xC6D6||ひ||辵 |- |0xC6D7||び||隶 |- |0xC6D8||ぴ||¨ |- |0xC6D9||ふ||ˆ |- |0xC6DA||ぶ||ヽ |- |0xC6DB||ぷ||ヾ |- |0xC6DC||へ||ゝ |- |0xC6DD||べ||ゞ |- |0xC6DE||ぺ||〃 |- |0xC6DF||ほ||仝 |- |0xC6E0||ぼ||々 |- |0xC6E1||ぽ||〆 |- |0xC6E2||ま||〇 |- |0xC6E3||み||ー |- |0xC6E4||む||[ |- |0xC6E5||め||] |- |0xC6E6||も||✽ |- |0xC6E7||ゃ||ぁ |- |0xC6E8||や||あ |- |0xC6E9||ゅ||ぃ |- |0xC6EA||ゆ||い |- |0xC6EB||ょ||ぅ |- |0xC6EC||よ||う |- |0xC6ED||ら||ぇ |- |0xC6EE||り||え |- |0xC6EF||る||ぉ |- |0xC6F0||れ||お |- |0xC6F1||ろ||か |- |0xC6F2||ゎ||が |- |0xC6F3||わ||き |- |0xC6F4||ゐ||ぎ |- 
|0xC6F5||ゑ||く |- |0xC6F6||を||ぐ |- |0xC6F7||ん||け |- |0xC6F8||ァ||げ |- |0xC6F9||ア||こ |- |0xC6FA||ィ||ご |- |0xC6FB||イ||さ |- |0xC6FC||ゥ||ざ |- |0xC6FD||ウ||し |- |0xC6FE||ェ||じ |- |0xC740||エ||す |- |0xC741||ォ||ず |- |0xC742||オ||せ |- |0xC743||カ||ぜ |- |0xC744||ガ||そ |- |0xC745||キ||ぞ |- |0xC746||ギ||た |- |0xC747||ク||だ |- |0xC748||グ||ち |- |0xC749||ケ||ぢ |- |0xC74A||ゲ||っ |- |0xC74B||コ||つ |- |0xC74C||ゴ||づ |- |0xC74D||サ||て |- |0xC74E||ザ||で |- |0xC74F||シ||と |- |0xC750||ジ||ど |- |0xC751||ス||な |- |0xC752||ズ||に |- |0xC753||セ||ぬ |- |0xC754||ゼ||ね |- |0xC755||ソ||の |- |0xC756||ゾ||は |- |0xC757||タ||ば |- |0xC758||ダ||ぱ |- |0xC759||チ||ひ |- |0xC75A||ヂ||び |- |0xC75B||ッ||ぴ |- |0xC75C||ツ||ふ |- |0xC75D||ヅ||ぶ |- |0xC75E||テ||ぷ |- |0xC75F||デ||へ |- |0xC760||ト||べ |- |0xC761||ド||ぺ |- |0xC762||ナ||ほ |- |0xC763||ニ||ぼ |- |0xC764||ヌ||ぽ |- |0xC765||ネ||ま |- |0xC766||ノ||み |- |0xC767||ハ||む |- |0xC768||バ||め |- |0xC769||パ||も |- |0xC76A||ヒ||ゃ |- |0xC76B||ビ||や |- |0xC76C||ピ||ゅ |- |0xC76D||フ||ゆ |- |0xC76E||ブ||ょ |- |0xC76F||プ||よ |- |0xC770||ヘ||ら |- |0xC771||ベ||り |- |0xC772||ペ||る |- |0xC773||ホ||れ |- |0xC774||ボ||ろ |- |0xC775||ポ||ゎ |- |0xC776||マ||わ |- |0xC777||ミ||ゐ |- |0xC778||ム||ゑ |- |0xC779||メ||を |- |0xC77A||モ||ん |- |0xC77B||ャ||ァ |- |0xC77C||ヤ||ア |- |0xC77D||ュ||ィ |- |0xC77E||ユ||イ |- |0xC7A1||ョ||ゥ |- |0xC7A2||ヨ||ウ |- |0xC7A3||ラ||ェ |- |0xC7A4||リ||エ |- |0xC7A5||ル||ォ |- |0xC7A6||レ||オ |- |0xC7A7||ロ||カ |- |0xC7A8||ヮ||ガ |- |0xC7A9||ワ||キ |- |0xC7AA||ヰ||ギ |- |0xC7AB||ヱ||ク |- |0xC7AC||ヲ||グ |- |0xC7AD||ン||ケ |- |0xC7AE||ヴ||ゲ |- |0xC7AF||ヵ||コ |- |0xC7B0||ヶ||ゴ |- |0xC7B1||Д||サ |- |0xC7B2||Е||ザ |- |0xC7B3||Ё||シ |- |0xC7B4||Ж||ジ |- |0xC7B5||З||ス |- |0xC7B6||И||ズ |- |0xC7B7||Й||セ |- |0xC7B8||К||ゼ |- |0xC7B9||Л||ソ |- |0xC7BA||М||ゾ |- |0xC7BB||У||タ |- |0xC7BC||Ф||ダ |- |0xC7BD||Х||チ |- |0xC7BE||Ц||ヂ |- |0xC7BF||Ч||ッ |- |0xC7C0||Ш||ツ |- |0xC7C1||Щ||ヅ |- |0xC7C2||Ъ||テ |- |0xC7C3||Ы||デ |- |0xC7C4||Ь||ト |- |0xC7C5||Э||ド |- |0xC7C6||Ю||ナ |- |0xC7C7||Я||ニ |- |0xC7C8||а||ヌ |- |0xC7C9||б||ネ |- |0xC7CA||в||ノ |- |0xC7CB||г||ハ |- |0xC7CC||д||バ |- 
|0xC7CD||е||パ |- |0xC7CE||ё||ヒ |- |0xC7CF||ж||ビ |- |0xC7D0||з||ピ |- |0xC7D1||и||フ |- |0xC7D2||й||ブ |- |0xC7D3||к||プ |- |0xC7D4||л||ヘ |- |0xC7D5||м||ベ |- |0xC7D6||н||ペ |- |0xC7D7||о||ホ |- |0xC7D8||п||ボ |- |0xC7D9||р||ポ |- |0xC7DA||с||マ |- |0xC7DB||т||ミ |- |0xC7DC||у||ム |- |0xC7DD||ф||メ |- |0xC7DE||х||モ |- |0xC7DF||ц||ャ |- |0xC7E0||ч||ヤ |- |0xC7E1||ш||ュ |- |0xC7E2||щ||ユ |- |0xC7E3||ъ||ョ |- |0xC7E4||ы||ヨ |- |0xC7E5||ь||ラ |- |0xC7E6||э||リ |- |0xC7E7||ю||ル |- |0xC7E8||я||レ |- |0xC7E9||①||ロ |- |0xC7EA||②||ヮ |- |0xC7EB||③||ワ |- |0xC7EC||④||ヰ |- |0xC7ED||⑤||ヱ |- |0xC7EE||⑥||ヲ |- |0xC7EF||⑦||ン |- |0xC7F0||⑧||ヴ |- |0xC7F1||⑨||ヵ |- |0xC7F2||⑩||ヶ |- |0xC7F3||⑴||А |- |0xC7F4||⑵||Б |- |0xC7F5||⑶||В |- |0xC7F6||⑷||Г |- |0xC7F7||⑸||Д |- |0xC7F8||⑹||Е |- |0xC7F9||⑺||Ё |- |0xC7FA||⑻||Ж |- |0xC7FB||⑼||З |- |0xC7FC||⑽||И |- |0xC7FD||(not used)||Й |- |0xC7FE||(not used)||К |- |0xC840||(not used)||Л |- |0xC841||(not used)||М |- |0xC842||(not used)||Н |- |0xC843||(not used)||О |- |0xC844||(not used)||П |- |0xC845||(not used)||Р |- |0xC846||(not used)||С |- |0xC847||(not used)||Т |- |0xC848||(not used)||У |- |0xC849||(not used)||Ф |- |0xC84A||(not used)||Х |- |0xC84B||(not used)||Ц |- |0xC84C||(not used)||Ч |- |0xC84D||(not used)||Ш |- |0xC84E||(not used)||Щ |- |0xC84F||(not used)||Ъ |- |0xC850||(not used)||Ы |- |0xC851||(not used)||Ь |- |0xC852||(not used)||Э |- |0xC853||(not used)||Ю |- |0xC854||(not used)||Я |- |0xC855||(not used)||а |- |0xC856||(not used)||б |- |0xC857||(not used)||в |- |0xC858||(not used)||г |- |0xC859||(not used)||д |- |0xC85A||(not used)||е |- |0xC85B||(not used)||ё |- |0xC85C||(not used)||ж |- |0xC85D||(not used)||з |- |0xC85E||(not used)||и |- |0xC85F||(not used)||й |- |0xC860||(not used)||к |- |0xC861||(not used)||л |- |0xC862||(not used)||м |- |0xC863||(not used)||н |- |0xC864||(not used)||о |- |0xC865||(not used)||п |- |0xC866||(not used)||р |- |0xC867||(not used)||с |- |0xC868||(not used)||т |- |0xC869||(not used)||у |- |0xC86A||(not used)||ф |- |0xC86B||(not 
used)||х |- |0xC86C||(not used)||ц |- |0xC86D||(not used)||ч |- |0xC86E||(not used)||ш |- |0xC86F||(not used)||щ |- |0xC870||(not used)||ъ |- |0xC871||(not used)||ы |- |0xC872||(not used)||ь |- |0xC873||(not used)||э |- |0xC874||(not used)||ю |- |0xC875||(not used)||я |- |} |} ==See also== * [[Unicode]] * [[Han unification]] * [[Chinese input methods for computers]] ==References== {{reflist}} *{{cite book|last=Lunde|first=Ken|year=1999|title=CJKV Information Processing|edition=First|publisher=O'Reilly and Associates, Inc.|isbn=978-1-56592-224-2|url-access=registration|url=https://archive.org/details/cjkvinformationp00lund}} ==External links== *[https://www.unicode.org/irg/docs/n2774-BigFiveSpecification.pdf A scan of the Big5 specification] provided by the [[Ideographic Research Group]] *[http://moztw.org/docs/big5/ Mozilla and the Big5 Family of Encodings]: an overview of Big5 encodings with code charts for each extension and relevant Firefox bugs <small>(Traditional Chinese)</small> *[http://ash.jp/code/cn/big5tbl.htm Big5 character code table] {{Webarchive|url=https://web.archive.org/web/20020504075931/http://ash.jp/code/cn/big5tbl.htm|date=2002-05-04}} *[https://web.archive.org/web/20041012135645/http://kura.hanazono.ac.jp/paper/codes.html Chinese character codes: an update] by Christian Wittern *[http://www.cns11643.gov.tw/AIDB/welcome.do CNS 11643 official web site] has information about the Big5e character set (an extended version of Big5) in the "Chinese Information Code" section. *[http://www.cns11643.gov.tw/web/big5/ Big5 introduction] Contains differences between extensions. 
*[http://demo.icu-project.org/icu-bin/convexp?conv=Big5 Graphical View of Big5 in ICU's Converter Explorer] *[http://depart.moe.edu.tw/ED2400/News_Content.aspx?n=8940E5C0456177C3&sms=893AAA1CBFE149DE&s=DFBE7BE3EE0DB6AE 教育部標準字體] Download page of the Taiwan Ministry of Education fonts *[http://www.sinica.edu.tw/~cdp/ 文獻處理實驗室] Download pages of the CDP font *[http://www.ogcio.gov.hk/en/business/tech_promotion/ccli/hkscs/ Hong Kong Supplementary Character Set Info] Downloadable HKSCS documents & font *[http://glyph.iso10646hk.net/chinese/download_001.jsp 香港參考宋體] Download page of Dynalab(華康科技有限公司)'s HKSCS font. *[https://msdn.microsoft.com/en-us/library/cc195006.aspx Microsoft's Windows Codepage 950] (Traditional Chinese Big5) *[http://on.cc/orimain/orisunfaq/hkfonts_bottom.html on.cc] Download page of the OPG font *[http://www.mimosapudica.org/ChinaSea/index_c.html 中國海字集視窗版(v3.0)下載網頁] Download page of the ChinaSea font *[https://web.archive.org/web/20150612031232/http://www.b-t.asia/chinese/big5.php Big5 Codeset Overview] {{Character encoding}} [[Category:Chinese character encodings]]