=== Standardized subsets ===
Several subsets of Unicode are standardized. Microsoft Windows, since [[Windows NT 4.0]], supports [[WGL-4]], a 657-character subset considered to cover all contemporary European languages that use the Latin, Greek, or Cyrillic script. Other standardized subsets of Unicode include the Multilingual European Subsets:<ref>[https://www.evertype.com/standards/iso10646/pdf/cwa13873.pdf CWA 13873:2000 – Multilingual European Subsets in ISO/IEC 10646-1] [[European Committee for Standardization|CEN]] Workshop Agreement 13873</ref> MES-1 (Latin scripts only; 335 characters), MES-2 (Latin, Greek, and Cyrillic; 1062 characters),<ref>{{Cite web |last = Kuhn |first = Markus |author-link = Markus Kuhn (computer scientist) |date = 1998 |title=Multilingual European Character Set 2 (MES-2) Rationale |url=https://www.cl.cam.ac.uk/~mgk25/ucs/mes-2-rationale.html |access-date=20 March 2023 |publisher=University of Cambridge}}</ref> and MES-3A and MES-3B (two larger subsets, not shown here). MES-2 includes every character in MES-1 and WGL-4.

The standard [[DIN 91379]]<ref>{{Cite web |title=DIN 91379:2022-08: Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM |url=https://www.beuth.de/en/standard/din-91379/353496133 |access-date=21 August 2022 |publisher=Beuth Verlag}}</ref> specifies a subset of Unicode letters, special characters, and sequences of letters and diacritic signs to allow the correct representation of names and to simplify data exchange in Europe. The standard supports all official languages of all European Union countries, as well as the German minority languages and the official languages of Iceland, Liechtenstein, Norway, and Switzerland. To allow names in other writing systems to be transliterated to the Latin script according to the relevant ISO standards, all necessary combinations of base letters and diacritic signs are provided.

{| class="wikitable"
|+ {{nobold|'''WGL-4''', ''MES-1'' and MES-2}}
|-
! Row !! Cells !! Range(s)
|-
!rowspan="2"| 00
| '''''20–7E'''''
| [[Basic Latin (Unicode block)|Basic Latin]] (00–7F)
|-
| '''''A0–FF'''''
| [[Latin-1 Supplement (Unicode block)|Latin-1 Supplement]] (80–FF)
|-
!rowspan="2"| 01
| '''''00–13,'' 14–15, ''16–2B,'' 2C–2D, ''2E–4D,'' 4E–4F, ''50–7E,'' 7F'''
| [[Latin Extended-A]] (00–7F)
|-
| 8F, '''92,''' B7, DE–EF, '''FA–FF'''
| [[Latin Extended-B]] (80–FF <span title="U+024F">...</span>)
|-
!rowspan="3"| 02
| 18–1B, 1E–1F
| Latin Extended-B (<span title="U+00180">...</span> 00–4F)
|-
| 59, 7C, 92
| [[IPA Extensions]] (50–AF)
|-
| BB–BD, '''C6, ''C7,'' C9,''' D6, '''''D8–DB,'' DC, ''DD,''''' DF, EE
| [[Spacing Modifier Letters]] (B0–FF)
|-
! 03
| 74–75, 7A, 7E, '''84–8A, 8C, 8E–A1, A3–CE,''' D7, DA–E1
| [[Greek and Coptic|Greek]] (70–FF)
|-
! 04
| '''00–5F, 90–91,''' 92–C4, C7–C8, CB–CC, D0–EB, EE–F5, F8–F9
| [[Cyrillic (Unicode block)|Cyrillic]] (00–FF)
|-
! 1E
| 02–03, 0A–0B, 1E–1F, 40–41, 56–57, 60–61, 6A–6B, '''80–85,''' 9B, '''F2–F3'''
| [[Latin Extended Additional]] (00–FF)
|-
! 1F
| 00–15, 18–1D, 20–45, 48–4D, 50–57, 59, 5B, 5D, 5F–7D, 80–B4, B6–C4, C6–D3, D6–DB, DD–EF, F2–F4, F6–FE
| [[Greek Extended]] (00–FF)
|-
!rowspan="3"| 20
| '''13–14, ''15,'' 17, ''18–19,'' 1A–1B, ''1C–1D,'' 1E, 20–22, 26, 30, 32–33, 39–3A, 3C, 3E, 44,''' 4A
| [[General Punctuation]] (00–6F)
|-
| '''7F''', 82
| [[Superscripts and Subscripts]] (70–9F)
|-
| '''A3–A4, A7, ''AC,''''' AF
| [[Currency Symbols (Unicode block)|Currency Symbols]] (A0–CF)
|-
!rowspan="3"| 21
| '''05, 13, 16, ''22, 26,'' 2E'''
| [[Letterlike Symbols]] (00–4F)
|-
| '''''5B–5E'''''
| [[Number Forms]] (50–8F)
|-
| '''''90–93,'' 94–95, A8'''
| [[Arrows (Unicode block)|Arrows]] (90–FF)
|-
! 22
| 00, '''02,''' 03, '''06,''' 08–09, '''0F, 11–12, 15, 19–1A, 1E–1F,''' 27–28, '''29,''' 2A, '''2B, 48,''' 59, '''60–61, 64–65,''' 82–83, 95, 97
| [[Mathematical Operators]] (00–FF)
|-
! 23
| '''02, 0A, 20–21,''' 29–2A
| [[Miscellaneous Technical]] (00–FF)
|-
!rowspan="3"| 25
| '''00, 02, 0C, 10, 14, 18, 1C, 24, 2C, 34, 3C, 50–6C'''
| [[Box Drawing]] (00–7F)
|-
| '''80, 84, 88, 8C, 90–93'''
| [[Block Elements]] (80–9F)
|-
| '''A0–A1, AA–AC, B2, BA, BC, C4, CA–CB, CF, D8–D9, E6'''
| [[Geometric Shapes (Unicode block)|Geometric Shapes]] (A0–FF)
|-
! 26
| '''3A–3C, 40, 42, 60, 63, 65–66, ''6A,'' 6B'''
| [[Miscellaneous Symbols]] (00–FF)
|-
! F0
| (01–02)<!--in WGL-4, but not in MES-2-->
| [[Private Use Area (Unicode block)|Private Use Area]] (00–FF ...)
|-
! FB
| '''01–02'''
| [[Alphabetic Presentation Forms]] (00–4F)
|-
! FF
| FD
| [[Specials (Unicode block)|Specials]]
|}

Rendering software that cannot process a Unicode character appropriately often displays it as an open rectangle, or as {{tt|U+FFFD}}, to indicate the position of the unrecognized character. Some systems attempt to provide more information about such characters. Apple's [[Last Resort font]] displays a substitute glyph indicating the Unicode range of the character, and [[SIL International]]'s [[Unicode fallback font]] displays a box showing the hexadecimal scalar value of the character.
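
The row-and-cell notation of the table can be expanded into code points mechanically: the row is the high byte of the code point and each cell entry is a low byte or a range of low bytes, both in hexadecimal. The following Python sketch illustrates this under some stated assumptions. It uses only three rows copied from the table (00, 03, and 04), so it is an excerpt and not the full MES-2 repertoire. The names <code>MES2_EXCERPT</code>, <code>expand_row</code>, <code>build_subset</code>, and <code>render_with_fallback</code> are hypothetical, and the <code>[U+XXXX]</code> marker is only an illustrative stand-in for the idea of showing the hexadecimal scalar value of an unsupported character; it is not how any particular fallback font is implemented.

```python
import re

# Illustrative excerpt only: three rows copied from the table above.
# Key = row (high byte, hex); value = cells (low bytes or ranges, hex).
MES2_EXCERPT = {
    "00": "20–7E, A0–FF",
    "03": "74–75, 7A, 7E, 84–8A, 8C, 8E–A1, A3–CE, D7, DA–E1",
    "04": "00–5F, 90–91, 92–C4, C7–C8, CB–CC, D0–EB, EE–F5, F8–F9",
}


def expand_row(row, cells):
    """Expand one table row into the set of code points it lists."""
    base = int(row, 16) << 8  # the row number is the high byte
    points = set()
    for item in re.split(r"\s*,\s*", cells.strip()):
        # Accept both the en dash used in the table and an ASCII hyphen.
        lo, _, hi = item.replace("–", "-").partition("-")
        start = int(lo, 16)
        end = int(hi, 16) if hi else start
        points.update(base + cell for cell in range(start, end + 1))
    return points


def build_subset(table):
    """Union of all rows of a row -> cells mapping."""
    subset = set()
    for row, cells in table.items():
        subset |= expand_row(row, cells)
    return subset


def render_with_fallback(text, subset):
    """Keep characters in the subset; show others as a [U+XXXX] marker."""
    return "".join(
        ch if ord(ch) in subset else f"[U+{ord(ch):04X}]" for ch in text
    )


SUBSET = build_subset(MES2_EXCERPT)
print(render_with_fallback("Ωmega ☃", SUBSET))  # prints: Ωmega [U+2603]
```

In this sketch, Ω (U+03A9) falls within the A3–CE cells of row 03 and is kept, whereas ☃ (U+2603) is outside the excerpt and is replaced by its scalar value.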