== Common character encodings ==
{{Main|Popularity of text encodings}}
{{Expand section|Popularity and comparison:
* Statistics on popularity
* Especially, a comparison of the advantages and disadvantages of the few 3-5 most common character encodings (e.g. UTF-8, UTF-16 and UTF-32)|date=June 2024}}

The [[Popularity of text encodings|most used character encoding]] on the [[World Wide Web|web]] is [[UTF-8]], used in 98.2% of surveyed web sites, as of May 2024.<ref name="W3TechsWebEncoding" /> In [[Application software|application programs]] and [[operating system]] tasks, both UTF-8 and [[UTF-16]] are popular options.<ref name=":0" /><ref name=":1">{{Cite web |last=Galloway |first=Matt |date=9 October 2012 |title=Character encoding for iOS developers. Or UTF-8 what now? |url=https://www.galloway.me.uk/2012/10/character-encoding-for-ios-developers-utf8/ |access-date=2021-01-02 |website=Matt Galloway |language=en |quote=in reality, you usually just assume UTF-8 since that is by far the most common encoding.}}</ref>

{{Div col|colwidth=30em}}
* [[ISO/IEC 646|ISO 646]]
** [[ASCII]]
* [[EBCDIC]]
* [[ISO/IEC 8859|ISO 8859]]:
** [[ISO/IEC 8859-1|ISO 8859-1]] Western Europe
** [[ISO/IEC 8859-2|ISO 8859-2]] Western and Central Europe
** [[ISO/IEC 8859-3|ISO 8859-3]] Western Europe and South European (Turkish, Maltese plus Esperanto)
** [[ISO/IEC 8859-4|ISO 8859-4]] Western Europe and Baltic countries (Lithuania, Estonia, Latvia and Lapp)
** [[ISO/IEC 8859-5|ISO 8859-5]] Cyrillic alphabet
** [[ISO/IEC 8859-6|ISO 8859-6]] Arabic
** [[ISO/IEC 8859-7|ISO 8859-7]] Greek
** [[ISO/IEC 8859-8|ISO 8859-8]] Hebrew
** [[ISO/IEC 8859-9|ISO 8859-9]] Western Europe with amended Turkish character set
** [[ISO/IEC 8859-10|ISO 8859-10]] Western Europe with rationalised character set for Nordic languages, including complete Icelandic set
** [[ISO/IEC 8859-11|ISO 8859-11]] Thai
** [[ISO/IEC 8859-13|ISO 8859-13]] Baltic languages plus Polish
** [[ISO/IEC 8859-14|ISO 8859-14]] Celtic languages (Irish Gaelic, Scottish, Welsh)
** [[ISO/IEC 8859-15|ISO 8859-15]] Added the Euro sign and other rationalisations to ISO 8859-1
** [[ISO/IEC 8859-16|ISO 8859-16]] Central, Eastern and Southern European languages (Albanian, Bosnian, Croatian, Hungarian, Polish, Romanian, Serbian and Slovenian, but also French, German, Italian and Irish Gaelic)
* [[Code page 437|CP437]], CP720, [[Code page 737|CP737]], [[Code page 850|CP850]], CP852, CP855, CP857, [[Code page 858|CP858]], CP860, [[Code page 861|CP861]], [[Code page 862|CP862]], [[Code page 863|CP863]], [[Code page 865|CP865]], [[Code page 866|CP866]], [[Code page 869|CP869]], [[Code page 872|CP872]]
* [[Windows code page|MS-Windows character sets]]:
** [[Windows-1250]] for Central European languages that use Latin script, (Polish, Czech, Slovak, Hungarian, Slovene, Serbian, Croatian, Bosnian, Romanian and Albanian)
** [[Windows-1251]] for Cyrillic alphabets
** [[Windows-1252]] for Western languages
** [[Windows-1253]] for Greek
** [[Windows-1254]] for Turkish
** [[Windows-1255]] for Hebrew
** [[Windows-1256]] for Arabic
** [[Windows-1257]] for Baltic languages
** [[Windows-1258]] for Vietnamese
* [[Mac OS Roman]]
* [[KOI8-R]], [[KOI8-U]], [[KOI7]]
* [[MIK Code page|MIK]]
* [[Indian Script Code for Information Interchange|ISCII]]
* [[Tamil Script Code for Information Interchange|TSCII]]
* [[Vietnamese Standard Code for Information Interchange|VISCII]]
* [[JIS X 0208]] is a widely deployed standard for Japanese character encoding that has several encoding forms.
** [[Shift JIS]] (Microsoft [[Code page 932 (Microsoft Windows)|Code page 932]] is a dialect of Shift_JIS)
** [[Extended Unix Code|EUC-JP]]
** [[ISO/IEC 2022|ISO-2022-JP]]
* [[JIS X 0213]] is an extended version of JIS X 0208.
** [[Shift JIS|Shift_JIS-2004]]
** [[Extended Unix Code|EUC-JIS-2004]]
** [[ISO/IEC 2022|ISO-2022-JP-2004]]
* Chinese [[List of GB standards|Guobiao]]
** [[GB 2312]]
** [[GBK (character encoding)|GBK]] (Microsoft Code page 936)
** [[GB 18030]]
* Taiwan [[Big5]] (a more famous variant is Microsoft [[Code page 950]])
** Hong Kong [[HKSCS]]
* Korean
** [[KS X 1001]] is a Korean double-byte character encoding standard
** [[Extended Unix Code#EUC-KR|EUC-KR]]
** [[ISO/IEC 2022|ISO-2022-KR]]
* [[Unicode]] (and subsets thereof, such as the 16-bit 'Basic Multilingual Plane')
** [[UTF-8]]
** [[UTF-16]]
** [[UTF-32]]
* [[ANSEL]] or [[ISO/IEC 6937]]
{{Div col end}}
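
As an illustration of how the encodings listed above differ in practice, the following minimal Python sketch encodes one short string under several of them and reports the resulting byte counts. The sample string, the selection of encodings, and the helper name <code>encoded_sizes</code> are assumptions chosen for this example and do not come from any cited source; the codec names are the aliases used by Python's standard library.

```python
# Illustrative sketch: byte length of the same text under several encodings.
# The helper name, sample text and encoding selection are assumptions.

def encoded_sizes(text, encodings):
    """Return {encoding name: byte count}, or None where the text
    contains a character the encoding cannot represent."""
    sizes = {}
    for name in encodings:
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            # The encoding has no code for at least one character.
            sizes[name] = None
    return sizes


SAMPLE = "na\u00efve \u20ac"  # "naïve €": 7 characters, including ï and the euro sign

ENCODINGS = [
    "utf-8",        # variable width, 1 to 4 bytes per character
    "utf-16-le",    # 2 or 4 bytes per character, no byte order mark
    "utf-32-le",    # fixed 4 bytes per character, no byte order mark
    "iso-8859-1",   # single byte, predates the euro sign
    "iso-8859-15",  # single byte, adds the euro sign
    "cp1252",       # Windows-1252, single byte
]

if __name__ == "__main__":
    for name, size in encoded_sizes(SAMPLE, ENCODINGS).items():
        print(name, "cannot encode" if size is None else f"{size} bytes")
```

For this sample, the single-byte encodings that include the euro sign use one byte per character, UTF-8 uses more than one byte for the two non-ASCII characters, and ISO 8859-1 cannot represent the string at all because it has no euro sign.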