Logogram
=== Characters in information technology ===
Entering complex characters can be cumbersome on electronic devices because the number of input keys is limited. Various [[input method]]s exist for entering logograms: some break them into their constituent parts, as with the [[Cangjie input method|Cangjie]] and [[Wubi method]]s of typing Chinese, while others use phonetic systems such as [[Bopomofo]] or [[Pinyin]], where the word is entered as pronounced and then selected from a list of matching logograms. While the component-based methods are (linearly) faster, they are more difficult to learn. With the Chinese alphabet system, however, the strokes forming the logogram are typed as they are normally written, and the corresponding logogram is then entered.{{clarify|date=November 2017|reason=What is "the Chinese alphabet system"? No such system has been mentioned yet.}}

Because of the number of glyphs, more memory is also needed in programming and computing in general to store each grapheme, as the character set is larger. For comparison, [[ISO 8859]] requires only one [[byte]] for each grapheme, while the [[Basic Multilingual Plane]] encoded in [[UTF-8]] requires up to three bytes. On the other hand, English words, for example, average five characters and a space per word<ref>{{cite web |first=David |last=Hearle |title=Sentence and word length |url= http://hearle.nahoo.net/Academic/Maths/Sentence.html |publisher=self-published |access-date=27 May 2007}} {{self-published source|date=November 2017|certain=y}}</ref>{{self-published inline|date=November 2017|certain=y}} and thus need six bytes for every word. Since many logograms contain more than one grapheme, it is not clear which is more memory-efficient. [[Variable-width encoding]]s allow a unified character encoding standard such as [[Unicode]] to use only the bytes necessary to represent a character, reducing the overhead that results from merging large character sets with smaller ones.
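
The byte counts mentioned above can be checked with a minimal sketch such as the following. The helper name <code>encoded_size</code> and the sample strings are illustrative assumptions, not taken from any cited source. A single example word cannot settle which writing system is more memory-efficient overall.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`.

    Hypothetical helper for illustration only.
    """
    return len(text.encode(encoding))


# Illustrative samples (assumptions chosen for this sketch).
english_word = "house "  # five letters plus a trailing space
logogram = "家"           # U+5BB6, one character in the Basic Multilingual Plane
accented = "é"           # U+00E9, present in ISO 8859-1

# ISO 8859-1 stores every character it supports in exactly one byte.
latin1_english = encoded_size(english_word, "iso-8859-1")
latin1_accented = encoded_size(accented, "iso-8859-1")

# UTF-8 is variable-width: ASCII stays at one byte per character,
# while other BMP characters take two or three bytes.
utf8_english = encoded_size(english_word, "utf-8")
utf8_accented = encoded_size(accented, "utf-8")
utf8_logogram = encoded_size(logogram, "utf-8")

print("English word, ISO 8859-1:", latin1_english, "bytes")
print("English word, UTF-8:     ", utf8_english, "bytes")
print("Accented letter, UTF-8:  ", utf8_accented, "bytes")
print("Logogram, UTF-8:         ", utf8_logogram, "bytes")
```

In this sketch the six-character English word occupies six bytes in either encoding, while the single logogram occupies three bytes in UTF-8. This shows the variable-width behaviour described above: ASCII text carries no overhead from sharing an encoding with a much larger character set.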