
== Criticism ==
Many older character encodings (unlike Unicode) suffer from several problems. Some vendors insufficiently document the meaning of all code point values in their code pages, which makes it harder to handle textual data consistently across different computer systems. Some vendors add proprietary extensions to established code pages to add or change certain code point values: for example, byte 0x5C in [[Shift JIS]] can represent either a [[backslash]] or a [[yen sign]] depending on the platform. Finally, to support several languages in a program that does not use Unicode, the code page used for each string or document must be stored alongside it.
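
The last point can be illustrated with a minimal Python sketch. The <code>TaggedText</code> class is a hypothetical container invented for this example, not part of any library; it assumes Python's built-in <code>cp1252</code>, <code>cp1251</code> and <code>cp437</code> codecs. The same four bytes read as three different strings depending on which code page label travels with them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """Hypothetical container: raw bytes plus the code page needed to read them."""
    data: bytes
    code_page: str

    def to_str(self) -> str:
        # Without the stored label there is no reliable way to pick a decoder.
        return self.data.decode(self.code_page)


raw = b"caf\xe9"  # byte 0xE9 means something different in each code page

readings = {
    cp: TaggedText(raw, cp).to_str()
    for cp in ("cp1252", "cp1251", "cp437")
}

for cp, text in readings.items():
    print(f"{cp}: {text}")
```

If the label is lost or wrong, the bytes still decode without error, so the mistake shows up only as wrong characters in the output.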

Applications may also mislabel text in [[Windows-1252]] as [[ISO-8859-1]]. The only difference between these code pages is that the code point values in the range 0x80{{ndash}}0x9F, used by ISO-8859-1 for control characters, are instead used as additional printable characters in Windows-1252{{snd}} notably for [[quotation marks]], the [[euro sign]] and the [[trademark symbol]] among others. Browsers on non-Windows platforms would tend to show empty boxes or question marks for these characters, making the text hard to read. Most browsers fixed this by ignoring the declared character set and interpreting the text as Windows-1252 so that it displayed acceptably. In HTML5, treating ISO-8859-1 as Windows-1252 is even codified as a [[W3C]] standard.<ref>{{cite web |url=https://encoding.spec.whatwg.org/#names-and-labels |title=Encoding |at=sec. 4.2 Names and labels |publisher=[[WHATWG]] |date=27 January 2015 |access-date=4 February 2015 |archive-url=https://web.archive.org/web/20150204174315/https://encoding.spec.whatwg.org/#names-and-labels |archive-date=4 February 2015 |url-status=live}}</ref> Although browsers were typically programmed to deal with this behaviour, this was not always true of other software. Consequently, when receiving a file transfer from a Windows system, non-Windows platforms would either ignore these characters or treat them as standard control characters and attempt to take the specified control action.
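
The mislabelling problem can be reproduced with a short Python sketch, assuming Python's built-in <code>cp1252</code> and <code>latin-1</code> codecs stand in for Windows-1252 and ISO-8859-1. The sample bytes are made up for illustration:

```python
# Bytes 0x93, 0x94, 0x80 and 0x99 are printable characters in Windows-1252
# but C1 control characters in ISO-8859-1.
raw = b"\x93quoted\x94 \x80 5 \x99"

as_cp1252 = raw.decode("cp1252")    # curly quotes, euro sign, trademark symbol
as_latin1 = raw.decode("latin-1")   # same bytes read as invisible control codes

print(as_cp1252)
print([hex(ord(c)) for c in as_latin1 if 0x80 <= ord(c) <= 0x9F])

# Outside 0x80-0x9F the two code pages agree byte for byte.
shared = bytes(b for b in range(256) if not 0x80 <= b <= 0x9F)
assert shared.decode("cp1252") == shared.decode("latin-1")
```

Decoding with the wrong label raises no error here, which is why software that trusted an ISO-8859-1 label would silently pass control codes through instead of punctuation.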

Because of Unicode's extensive documentation, vast character repertoire and character stability policy, the problems listed above are rarely a concern for Unicode. [[UTF-8]] (which can encode over one million code points) has replaced the code-page method in terms of popularity on the Internet.<ref name="Statistics"/><ref name="Statistics_UTF-8"/>
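
As a rough illustration of the "over one million" figure, the following Python sketch encodes a few arbitrarily chosen sample characters with the built-in <code>utf-8</code> codec and counts the code points that UTF-8 can represent (U+0000 to U+10FFFF, excluding the surrogate range U+D800 to U+DFFF):

```python
# Sample characters chosen for illustration, from 1-byte to 4-byte encodings.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600", "\U0010FFFF"]

# Map each code point to the number of bytes UTF-8 needs for it.
lengths = {f"U+{ord(ch):04X}": len(ch.encode("utf-8")) for ch in samples}

# All code points up to U+10FFFF, minus the 2,048 surrogates,
# which are reserved for UTF-16 and cannot be encoded in UTF-8.
encodable = 0x110000 - (0xE000 - 0xD800)

print(lengths)
print(encodable)
```

A single encoding therefore covers every script at once, so no per-document code page label is needed to tell the characters apart.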