Han unification
{{Short description|Effort to map CJK characters in Unicode}}
{{redirect|Unihan|the historical period|Chu–Han Contention|other uses}}
{{Multiple issues|
{{more citations needed|date=February 2010}}
{{Overly detailed|date=December 2020}}
{{Original research|date=February 2024}}
}}
[[File:Source Han Sans Version Difference.svg|thumb|right|Differences for the same Unicode [[code point]] (U+8FD4) in regional versions of [[Source Han Sans]]]]
{{SpecialChars}}
'''Han unification''' is an effort by the authors of [[Unicode]] and the [[Universal Character Set]] to map multiple [[character set]]s of the [[Han characters]] of the so-called [[CJK characters|CJK]] languages into a single set of unified [[grapheme|characters]]. Han characters are shared by written [[Chinese language|Chinese]] ([[hanzi]]), [[Japanese language|Japanese]] ([[kanji]]), [[Korean language|Korean]] ([[hanja]]) and [[Vietnamese language|Vietnamese]] ([[chữ Hán]]).

Modern Chinese, Japanese and Korean [[typeface]]s typically use regional or historical [[variant Chinese characters|variants of a given Han character]]. When Unicode was formulated, an attempt was made to unify these variants by treating them as [[allograph]]s{{snd}}different [[glyphs]] representing the same "grapheme" or [[orthography|orthographic]] unit{{snd}}hence "Han unification", with the resulting character repertoire sometimes contracted to '''Unihan'''.<ref>{{cite web |title=Unicode® Standard Annex #38 {{!}} UNICODE HAN DATABASE (UNIHAN) |url=https://www.unicode.org/reports/tr38/ |publisher=[[Unicode Consortium]] |date=2023-09-01 }}</ref>{{efn|Unihan can also refer to the Unihan Database maintained by the [[Unicode Consortium]], which provides information about all of the unified Han characters encoded in the Unicode Standard, including mappings to various national and industry standards, indices into standard dictionaries, encoded variants, pronunciations in various languages, and an English definition.
The database is available to the public as text files<ref name="UnihanZip">{{cite web|url=https://www.unicode.org/Public/UNIDATA/Unihan.zip|title=Unihan.zip|work=The Unicode Standard|publisher=Unicode Consortium}}</ref> and via an interactive website.<ref name="UnihanLookup">{{cite web|url=https://www.unicode.org/charts/unihan.html|title=Unihan Database Lookup|work=The Unicode Standard|publisher=Unicode Consortium}}</ref><ref>{{cite web|url=https://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=4E2D|title=Unihan Database Lookup: Sample lookup for 中|work=The Unicode Standard|publisher=Unicode Consortium}}</ref> The latter also includes representative glyphs and definitions for compound words drawn from the free Japanese [[EDICT]] and Chinese [[CEDICT]] dictionary projects (which are provided for convenience and are not a formal part of the Unicode Standard).}} Nevertheless, many characters have regional variants assigned to different [[code point]]s, such as [[Traditional Chinese characters|Traditional]] {{Lang|zh-Hant|個}} (U+500B) versus [[Simplified Chinese characters|Simplified]] {{Lang|zh-Hans|个}} (U+4E2A).
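
The point that some variants were not unified can be checked directly. The following is a minimal sketch in Python, using only the standard <code>unicodedata</code> module; the helper name <code>code_point</code> and the constants are illustrative choices, not part of any standard. It prints the code points of the Traditional and Simplified characters cited above and checks that compatibility normalization (NFKC) does not merge them.

```python
import unicodedata


def code_point(ch: str) -> str:
    """Format a single character as a U+XXXX code point label."""
    return f"U+{ord(ch):04X}"


# The two variants cited above, written as escapes so the source stays ASCII.
TRADITIONAL = "\u500b"  # Traditional form, U+500B
SIMPLIFIED = "\u4e2a"   # Simplified form, U+4E2A

# Print each character's code point and its formal Unicode character name.
for ch in (TRADITIONAL, SIMPLIFIED):
    print(code_point(ch), unicodedata.name(ch))

# NFKC normalization leaves both characters unchanged,
# so they stay distinct at the encoding level.
still_distinct = (
    unicodedata.normalize("NFKC", TRADITIONAL)
    != unicodedata.normalize("NFKC", SIMPLIFIED)
)
print(still_distinct)
```

By contrast, the character in the figure (U+8FD4) is a single code point, so a check like this cannot tell its regional shapes apart; the difference there comes from the typeface used to render it.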