Han unification
=={{anchor|Examples of language dependent characters}}Examples of language-dependent glyphs==
In each row of the following table, the same character is repeated in all six columns. However, each column is marked (by the <code>lang</code> attribute) as being in a different language: [[Chinese language|Chinese]] ([[Simplified Chinese character|simplified]] and two types of [[Traditional Chinese character|traditional]]), [[Japanese language|Japanese]], [[Korean language|Korean]], or [[Vietnamese language|Vietnamese]]. The [[Web browser|browser]] should select, for each character, a [[glyph]] (from a font) suitable to the specified language. (Besides actual character variation, visible as differences in stroke order, number, or direction, the typefaces may also reflect different typographical styles, as with serif and sans-serif alphabets.) Fallback glyph selection of this kind only works if CJK fonts are installed on your system and the font selected to display this article does not itself include glyphs for these characters.
<!-- NOTICE TO EDITORS: Do not replace characters here with their simplified or traditional variants. This table is designed to demonstrate the treatment of identical characters under different HTML locale settings. It is not created to show the difference between traditional and simplified Chinese. -->
{| class="wikitable" style="text-align: center; font: xx-large/normal sans-serif; margin:0.5em auto;"
|-style="font-size: medium"
! rowspan="2" | Code point
! Chinese <br />(simplified)
! Chinese <br />(traditional)
! Chinese <br />(traditional,<br/>Hong Kong)
! Japanese
! Korean
! Vietnamese
! rowspan="2" | English
|-style="font-size: medium"
! <code>zh-Hans</code>
! <code>zh-Hant</code>
! <code>zh-Hant-HK</code>
! <code>ja</code>
! <code>ko</code>
! <code>vi-Hani</code>
|-
|style="font-size: medium; font-family: sans-serif;" |{{U+|4ECA}}
| lang="zh-Hans" |今
| lang="zh-Hant" |今
| lang="zh-Hant-HK" |今
| lang="ja" |今
| lang="ko" |今
| lang="vi" |{{vi-nom|今}}
|style="font-size: medium; font-family: sans-serif;" | now
|-
|style="font-size: medium; font-family: sans-serif;" |U+4EE4
| lang="zh-Hans" |令
| lang="zh-Hant" |令
| lang="zh-Hant-HK" |令
| lang="ja" |令
| lang="ko" |令
| lang="vi" |{{vi-nom|令}}
|style="font-size: medium; font-family: sans-serif;" | cause/command
|-
|style="font-size: medium; font-family: sans-serif;" |U+514D
| lang="zh-Hans" |免
| lang="zh-Hant" |免
| lang="zh-Hant-HK" |免
| lang="ja" |免
| lang="ko" |免
| lang="vi" |{{vi-nom|免}}
|style="font-size: medium; font-family: sans-serif;" | exempt/spare
|-
|style="font-size: medium; font-family: sans-serif;" |U+5165
| lang="zh-Hans" |入
| lang="zh-Hant" |入
| lang="zh-Hant-HK" |入
| lang="ja" |入
| lang="ko" |入
| lang="vi" |{{vi-nom|入}}
|style="font-size: medium; font-family: sans-serif;" | enter
|-
|style="font-size: medium; font-family: sans-serif;" |U+5168
| lang="zh-Hans" |全
| lang="zh-Hant" |全
| lang="zh-Hant-HK" |全
| lang="ja" |全
| lang="ko" |全
| lang="vi" |{{vi-nom|全}}
|style="font-size: medium; font-family: sans-serif;" | all/total
|-
|style="font-size: medium; font-family: sans-serif;" |U+5173
| lang="zh-Hans" |关
| lang="zh-Hant" |关
| lang="zh-Hant-HK" |关
| lang="ja" |关
| lang="ko" |关
| lang="vi" |{{vi-nom|关}}
|style="font-size: medium; font-family: sans-serif;" | close (simplified) / laugh (traditional)
|-
|style="font-size: medium; font-family: sans-serif;" |U+5177
| lang="zh-Hans" |具
| lang="zh-Hant" |具
| lang="zh-Hant-HK" |具
| lang="ja" |具
| lang="ko" |具
| lang="vi" |{{vi-nom|具}}
|style="font-size: medium; font-family: sans-serif;" | tool
|-
|style="font-size: medium; font-family: sans-serif;" |U+5203
| lang="zh-Hans" |刃
| lang="zh-Hant" |刃
| lang="zh-Hant-HK" |刃
| lang="ja" |刃
| lang="ko" |刃
| lang="vi" |{{vi-nom|刃}}
|style="font-size: medium; font-family: sans-serif;" | knife edge
|-
|style="font-size: medium; font-family: sans-serif;" |U+5316
| lang="zh-Hans" |化
| lang="zh-Hant" |化
| lang="zh-Hant-HK" |化
| lang="ja" |化
| lang="ko" |化
| lang="vi" |{{vi-nom|化}}
|style="font-size: medium; font-family: sans-serif;" | transform/change
|-
|style="font-size: medium; font-family: sans-serif;" |U+5916
| lang="zh-Hans" |外
| lang="zh-Hant" |外
| lang="zh-Hant-HK" |外
| lang="ja" |外
| lang="ko" |外
| lang="vi" |{{vi-nom|外}}
|style="font-size: medium; font-family: sans-serif;" | outside
|-
|style="font-size: medium; font-family: sans-serif;" |U+60C5
| lang="zh-Hans" |情
| lang="zh-Hant" |情
| lang="zh-Hant-HK" |情
| lang="ja" |情
| lang="ko" |情
| lang="vi" |{{vi-nom|情}}
|style="font-size: medium; font-family: sans-serif;" | feeling
|-
|style="font-size: medium; font-family: sans-serif;" |U+624D
| lang="zh-Hans" |才
| lang="zh-Hant" |才
| lang="zh-Hant-HK" |才
| lang="ja" |才
| lang="ko" |才
| lang="vi" |{{vi-nom|才}}
|style="font-size: medium; font-family: sans-serif;" | talent
|-
|style="font-size: medium; font-family: sans-serif;" |U+62B5
| lang="zh-Hans" |抵
| lang="zh-Hant" |抵
| lang="zh-Hant-HK" |抵
| lang="ja" |抵
| lang="ko" |抵
| lang="vi" |{{vi-nom|抵}}
|style="font-size: medium; font-family: sans-serif;" | arrive/resist
|-
|style="font-size: medium; font-family: sans-serif;" |U+6B21
| lang="zh-Hans" |次
| lang="zh-Hant" |次
| lang="zh-Hant-HK" |次
| lang="ja" |次
| lang="ko" |次
| lang="vi" |{{vi-nom|次}}
|style="font-size: medium; font-family: sans-serif;" | secondary/follow
|-
|style="font-size: medium; font-family: sans-serif;" |U+6D77
| lang="zh-Hans" |海
| lang="zh-Hant" |海
| lang="zh-Hant-HK" |海
| lang="ja" |海
| lang="ko" |海
| lang="vi" |{{vi-nom|海}}
|style="font-size: medium; font-family: sans-serif;" | sea
|-
|style="font-size: medium; font-family: sans-serif;" |U+753B
| lang="zh-Hans" |画
| lang="zh-Hant" |画
| lang="zh-Hant-HK" |画
| lang="ja" |画
| lang="ko" |画
| lang="vi" |{{vi-nom|画}}
|style="font-size: medium; font-family: sans-serif;" | picture
|-
|style="font-size: medium; font-family: sans-serif;" |U+76F4
| lang="zh-Hans" |直
| lang="zh-Hant" |直
| lang="zh-Hant-HK" |直
| lang="ja" |直
| lang="ko" |直
| lang="vi" |{{vi-nom|直}}
|style="font-size: medium; font-family: sans-serif;" | direct/straight
|-
|style="font-size: medium; font-family: sans-serif;" |U+771F
| lang="zh-Hans" |真
| lang="zh-Hant" |真
| lang="zh-Hant-HK" |真
| lang="ja" |真
| lang="ko" |真
| lang="vi" |{{vi-nom|真}}
|style="font-size: medium; font-family: sans-serif;" | true
|-
|style="font-size: medium; font-family: sans-serif;" |U+793A
| lang="zh-Hans" |示
| lang="zh-Hant" |示
| lang="zh-Hant-HK" |示
| lang="ja" |示
| lang="ko" |示
| lang="vi" |{{vi-nom|示}}
|style="font-size: medium; font-family: sans-serif;" | show
|-
|style="font-size: medium; font-family: sans-serif;" |U+795E
| lang="zh-Hans" |神
| lang="zh-Hant" |神
| lang="zh-Hant-HK" |神
| lang="ja" |神
| lang="ko" |神
| lang="vi" |{{vi-nom|神}}
|style="font-size: medium; font-family: sans-serif;" | god
|-
|style="font-size: medium; font-family: sans-serif;" |U+7A7A
| lang="zh-Hans" |空
| lang="zh-Hant" |空
| lang="zh-Hant-HK" |空
| lang="ja" |空
| lang="ko" |空
| lang="vi" |{{vi-nom|空}}
|style="font-size: medium; font-family: sans-serif;" | empty/air
|-
|style="font-size: medium; font-family: sans-serif;" |U+8005
| lang="zh-Hans" |者
| lang="zh-Hant" |者
| lang="zh-Hant-HK" |者
| lang="ja" |者
| lang="ko" |者
| lang="vi" |{{vi-nom|者}}
|style="font-size: medium; font-family: sans-serif;" | one who does/-ist/-er
|-
|style="font-size: medium; font-family: sans-serif;" |U+8349
| lang="zh-Hans" |草
| lang="zh-Hant" |草
| lang="zh-Hant-HK" |草
| lang="ja" |草
| lang="ko" |草
| lang="vi" |{{vi-nom|草}}
|style="font-size: medium; font-family: sans-serif;" | grass
|-
|style="font-size: medium; font-family: sans-serif;" |U+8525
| lang="zh-Hans" |蔥
| lang="zh-Hant" |蔥
| lang="zh-Hant-HK" |蔥
| lang="ja" |蔥
| lang="ko" |蔥
| lang="vi" |{{vi-nom|蔥}}
|style="font-size: medium; font-family: sans-serif;" | onion
|-
|style="font-size: medium; font-family: sans-serif;" |U+89D2
| lang="zh-Hans" |角
| lang="zh-Hant" |角
| lang="zh-Hant-HK" |角
| lang="ja" |角
| lang="ko" |角
| lang="vi" |{{vi-nom|角}}
|style="font-size: medium; font-family: sans-serif;" | edge/horn
|-
|style="font-size: medium; font-family: sans-serif;" |U+9053
| lang="zh-Hans" |道
| lang="zh-Hant" |道
| lang="zh-Hant-HK" |道
| lang="ja" |道
| lang="ko" |道
| lang="vi" |{{vi-nom|道}}
|style="font-size: medium; font-family: sans-serif;" | way/path/road
|-
|style="font-size: medium; font-family: sans-serif;" |U+96C7
| lang="zh-Hans" |雇
| lang="zh-Hant" |雇
| lang="zh-Hant-HK" |雇
| lang="ja" |雇
| lang="ko" |雇
| lang="vi" |{{vi-nom|雇}}
|style="font-size: medium; font-family: sans-serif;" | employ
|-
|style="font-size: medium; font-family: sans-serif;" |U+9AA8
| lang="zh-Hans" |骨
| lang="zh-Hant" |骨
| lang="zh-Hant-HK" |骨
| lang="ja" |骨
| lang="ko" |骨
| lang="vi" |{{vi-nom|骨}}
|style="font-size: medium; font-family: sans-serif;" | bone
|}

No character variant that is exclusive to Korean or Vietnamese has received its own code point, whereas almost all Shinjitai Japanese variants and Simplified Chinese variants have distinct code points and unambiguous reference glyphs in the Unicode standard.

In the twentieth century, East Asian countries each made their own encoding standards. Within each standard, some variants coexisted with distinct code points, hence the distinct code points in Unicode for certain sets of variants. Taking Simplified Chinese as an example, the two character variants {{Lang|zh-Hant-CN|內}} (U+5167) and {{Lang|zh-Hans-CN|内}} (U+5185) differ in exactly the same way as do the Korean and non-Korean variants of {{Lang|ko|全}} (U+5168). In each case, one variant is written with {{Lang|zh|入}} (U+5165) as a component and the other with {{Lang|zh|人}} (U+4EBA).
Both variants of the first character, {{Lang|zh-Hant-CN|內}} and {{Lang|zh-Hans-CN|内}}, got their own distinct code points, whereas the two variants of the second, {{Lang|ko|全}}, had to share a single code point. The justification Unicode gives is that the national standards body in the PRC made distinct code points for the two variants of {{Lang|zh-Hant-CN|內}}/{{Lang|zh-Hans-CN|内}}, whereas Korea never made separate code points for the different variants of {{Lang|ko|全}}. There is a reason for this that has nothing to do with how the domestic bodies view the characters themselves. China went through a process in the twentieth century that changed (if not simplified) several characters. During this transition, there was a need to be able to encode both variants within the same document. Korean has always used the variant of {{lang|ko|全}} with the {{Lang|ko|入}} (U+5165) radical on top, so Korean-language documents made in the twentieth century had little reason to represent both versions in the same document, and Korea had no reason to encode both variants.

Almost all of the variants that the PRC developed or standardized got distinct code points owing simply to the fortune of the Simplified Chinese transition carrying through into the computing age. This privilege, however, seems to apply inconsistently. Most simplifications performed in Japan and mainland China that had code points in national standards, including characters simplified differently in each country, did make it into Unicode as distinct code points. Yet sixty-two Shinjitai "simplified" characters with distinct code points in Japan got merged with their Kyūjitai traditional equivalents, like {{Lang|ja|海}}.{{citation needed|date=August 2022}} This can cause problems for the language tagging strategy. There is no universal tag for the traditional and "simplified" versions of Japanese as there is for Chinese.
Thus, any Japanese writer wanting to display the Kyūjitai form of {{Lang|ja|海}} may have to tag the character as "Traditional Chinese" or trust that the recipient's Japanese font uses only the Kyūjitai glyphs. Showing the two forms side by side, as in a Japanese textbook, may require tagging one as Traditional Chinese and the other as Simplified Chinese, which would preclude using the same font for the entire document. There are two distinct code points for {{Lang|ja|海}} in Unicode, but only for "compatibility reasons". Any Unicode-conformant font must display the two equivalent code points, Kyūjitai and Shinjitai, identically. Unofficially, a font may display {{Lang|ja|海}} differently, with 海 (U+6D77) as the Shinjitai version and 海 (U+FA45) as the Kyūjitai version (which is identical to the traditional version in written Chinese and Korean).

The radical {{Lang|zh|糸}} (U+7CF8) is used in characters like {{Lang|zh-Hant|紅}}/{{Lang|zh-Hans|红}} and has two variants, the second being simply the cursive form. The radical components of {{Lang|zh-Hant|紅}} (U+7D05) and {{Lang|zh-Hans|红}} (U+7EA2) are semantically identical, and the glyphs differ only in that the latter uses a cursive version of the {{Lang|zh|糸}} component. However, in mainland China, the standards bodies wanted to standardize the cursive form when used in characters like {{Lang|zh-Hans|红}}. Because this change happened relatively recently, there was a transition period. Both {{Lang|zh-Hant|紅}} (U+7D05) and {{Lang|zh-Hans|红}} (U+7EA2) got separate code points in the PRC's text encoding standards so that Chinese-language documents could use both versions. The two variants received distinct code points in Unicode as well.

The case of the radical {{Lang|zh|艸}} (U+8278) shows how arbitrary the state of affairs is. When used to compose characters like {{Lang|zh-Hant|草}} (U+8349), the radical was placed at the top, but had two different forms.
Traditional Chinese and Korean use a four-stroke version, in which the top of {{lang|zh-Hant|草}} looks like two plus signs ({{lang|zh-Hant|⺿}}). Simplified Chinese, Kyūjitai Japanese and Shinjitai Japanese use a three-stroke version, like two plus signs sharing their horizontal strokes ({{lang|zh-Hans|⺾}}, i.e. {{lang|zh-Hans|草}}). The PRC's text encoding bodies did not encode the two variants differently. The fact that almost every other change brought about by the PRC, no matter how minor, did warrant its own code point suggests that this exception may have been unintentional. Unicode copied the existing standards as is, preserving such irregularities.

The Unicode Consortium has recognized errors in other instances. The many Unicode blocks for CJK Han Ideographs contain redundancies inherited from the original standards, redundancies introduced by flawed importation of those standards, and accidental mergers that were later corrected, which provides precedent for dis-unifying characters.

For native speakers, variants can be unintelligible or unacceptable in educated contexts. English speakers may understand a handwritten note saying "4P5 kg" as "495 kg", but writing the nine backwards (so it looks like a "P") can be jarring and would be considered incorrect in any school. Likewise, users of one CJK language reading a document with "foreign" glyphs may find that variants of {{Lang|zh|骨}} appear as mirror images, that {{Lang|zh|者}} is missing a stroke or has an extraneous one, and that {{Lang|zh|令}} is unreadable to non-Japanese readers. (In Japan, both variants are accepted.)
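
The code-point relationships described in this section can be checked mechanically. The following is a minimal Python sketch, assuming the Unicode database bundled with the standard <code>unicodedata</code> module; the names <code>DISTINCT_PAIRS</code>, <code>code_point</code> and <code>survives_normalization</code> are illustrative and not part of any standard. It shows only one observable consequence of the encoding decisions: pairs that received separate code points stay distinct under normalization, while the compatibility code point for the Kyūjitai form of "sea" collapses into the unified one.

```python
import unicodedata

# Variant pairs that this section describes as having separate code points.
DISTINCT_PAIRS = {
    "inside": ("\u5167", "\u5185"),  # U+5167 / U+5185
    "red": ("\u7d05", "\u7ea2"),     # U+7D05 / U+7EA2
}

SEA_UNIFIED = "\u6d77"        # U+6D77, the unified code point for "sea"
SEA_COMPATIBILITY = "\ufa45"  # U+FA45, encoded only for compatibility


def code_point(ch: str) -> str:
    """Format a single character as U+XXXX."""
    return f"U+{ord(ch):04X}"


def survives_normalization(a: str, b: str, form: str = "NFC") -> bool:
    """Return True if a and b are still different after normalization."""
    return unicodedata.normalize(form, a) != unicodedata.normalize(form, b)


if __name__ == "__main__":
    for label, (first, second) in DISTINCT_PAIRS.items():
        print(label, code_point(first), code_point(second),
              "distinct:", survives_normalization(first, second))
    print("sea", code_point(SEA_UNIFIED), code_point(SEA_COMPATIBILITY),
          "distinct:", survives_normalization(SEA_UNIFIED, SEA_COMPATIBILITY))
```

Because normalized text cannot carry the U+6D77/U+FA45 distinction, any visual difference between the two forms has to come from font choice or language tagging rather than from the code points themselves.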
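
The language-tagging approach used in the table at the start of this section, and the workaround described above for showing two forms of one character side by side, both amount to wrapping the same code point in elements with different <code>lang</code> attributes. A minimal Python sketch of generating such markup follows; the function names <code>tag</code> and <code>side_by_side</code> are hypothetical. Whether the rendered glyphs actually differ depends on the browser and the installed fonts, which this sketch does not control.

```python
from html import escape


def tag(text: str, lang: str) -> str:
    """Wrap text in a span carrying a language tag such as 'ja' or 'zh-Hant'."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


def side_by_side(char: str, langs: list[str]) -> str:
    """Repeat one character once per language tag, separated by spaces."""
    return " ".join(tag(char, lang) for lang in langs)


if __name__ == "__main__":
    # The same code point (U+6D77) tagged as Japanese and as Traditional Chinese.
    print(side_by_side("\u6d77", ["ja", "zh-Hant"]))
```

The output contains the identical character in each span; only the <code>lang</code> attribute differs, which is why the result depends entirely on how the renderer maps language tags to fonts.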