UTF-16
== Size ==
A "character" may use any number of Unicode code points.<ref name="extended grapheme">{{Cite web|title=It's not wrong that "🤦🏼‍♂️".length == 7|url=https://hsivonen.fi/string-length/|access-date=2021-03-15|website=hsivonen.fi}}</ref> For instance, an [[regional indicator symbol|emoji flag character]] takes 8 bytes in UTF-16, since it is "constructed from a pair of Unicode scalar values"<ref>{{Cite web|title=Apple Developer Documentation|url=https://developer.apple.com/documentation/swift/string|access-date=2021-03-15|website=developer.apple.com}}</ref> (and those values are outside the BMP and require 4 bytes each). UTF-16 in no way assists in "counting characters" or in "measuring the width of a string".

UTF-16 is often claimed to be more space-efficient than [[UTF-8]] for East Asian languages, since it uses 2 bytes for characters that take 3 bytes in UTF-8. Since real text contains many spaces, numbers, punctuation, markup (e.g. for web pages), and control characters, which take only 1 byte in UTF-8 but 2 bytes in UTF-16, this is only true for artificially constructed dense blocks of text.{{citation needed|date=November 2022}} A more serious claim can be made for [[Devanagari]] and [[Bengali language|Bengali]], whose words consist of several letters, each of which takes 3 bytes in UTF-8 and only 2 in UTF-16.

In addition, the Chinese Unicode encoding standard [[GB 18030]] always produces files the same size or smaller than UTF-16 for all languages, not just for Chinese (it does this by sacrificing self-synchronization).
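
The size comparisons above can be checked with a short script. The following is a minimal sketch in Python, not part of any cited source: the helper name <code>encoded_sizes</code> and the sample strings are assumptions chosen for illustration, and the UTF-16 figure uses the little-endian form without a byte order mark.

```python
def encoded_sizes(text: str) -> dict:
    """Return the code point count and encoded byte sizes of text.

    Hypothetical helper for illustration only.
    """
    return {
        "code_points": len(text),                # Python str length counts code points
        "utf8": len(text.encode("utf-8")),
        "utf16": len(text.encode("utf-16-le")),  # little-endian, no BOM
    }


# Emoji flag: two regional indicator symbols (F + I), both outside the BMP.
flag = "\U0001F1EB\U0001F1EE"

# Dense East Asian text: three BMP characters.
cjk = "\u65e5\u672c\u8a9e"

# The same text wrapped in a small amount of ASCII markup.
markup = "<p>" + cjk + "</p>"

# Devanagari word made of six BMP code points.
devanagari = "\u0939\u093f\u0928\u094d\u0926\u0940"

if __name__ == "__main__":
    for label, sample in [("flag", flag), ("cjk", cjk),
                          ("markup", markup), ("devanagari", devanagari)]:
        print(label, encoded_sizes(sample))
```

In this sketch the flag occupies 8 bytes in both encodings. The bare East Asian sample is smaller in UTF-16, while the same sample inside ASCII markup is smaller in UTF-8. These strings are very short, so the result only illustrates the direction of the effect and is not a measurement of real documents.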