Letter case
=={{Anchor | Case folding}}Case folding and case conversion==
In the [[character set]]s developed for [[computing]], each upper- and lower-case letter is encoded as a separate character. In order to enable case folding and case conversion, the [[software]] needs to link together the two characters representing the case variants of a letter. (Some old character-encoding systems, such as the [[Baudot code]], are restricted to one set of letters, usually represented by the upper-case variants.)

[[Case sensitivity|Case-insensitive]] operations can be said to fold case, from the idea of folding the character code table so that upper- and lower-case letters coincide. The conversion of letter case in a [[String (computer science)|string]] is common practice in computer applications, for instance to make case-insensitive comparisons. Many high-level programming languages provide simple methods for case conversion, at least for the [[ASCII]] character set.

Whether or not the case variants are treated as equivalent to each other varies depending on the computer system and context. For example, user [[password]]s are generally case sensitive in order to allow more diversity and make them more difficult to break. In contrast, case is often ignored in [[keyword search]]es in order to ignore insignificant variations in keyword capitalisation both in queries and queried material.
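As a minimal sketch of a case-insensitive comparison, the following Python snippet folds both strings before comparing them. The helper names <code>equal_ignoring_case</code> and <code>keyword_matches</code> are hypothetical and chosen only for illustration; <code>str.casefold()</code> is Python's built-in case-folding method.

```python
def equal_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings after folding case on both sides."""
    return a.casefold() == b.casefold()


def keyword_matches(query: str, text: str) -> bool:
    """Report whether the folded query occurs in the folded text."""
    return query.casefold() in text.casefold()


print(equal_ignoring_case("Letter Case", "LETTER CASE"))   # True
print(keyword_matches("unicode", "The Unicode Standard"))  # True
print("Password1" == "password1")  # False: a plain comparison stays case sensitive
```

Folding both operands, rather than converting only one of them, keeps the comparison symmetric regardless of how either string was capitalised.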
===Unicode case folding and script identification===
[[Unicode]] defines case folding through the three case-mapping properties of each [[Character (computing)|character]]: upper case, lower case, and title case (in this context, "title case" relates to [[Typographic ligature|ligature]]s and [[Digraph (orthography)|digraph]]s encoded as mixed-case [[Digraph (orthography)#In Unicode|single characters]], in which the first component is in upper case and the second component in lower case).<ref>{{cite web | url = http://unicode.org/faq/casemap_charprop.html#4 | title = Character Properties, Case Mappings & Names FAQ | publisher = Unicode | access-date = 19 February 2017}}</ref> These properties relate all characters in scripts with differing cases to the other case variants of the character.

As briefly discussed in [[Unicode]] Technical Note #26,<ref name="Unicode" /> "In terms of implementation issues, any attempt at a unification of Latin, Greek, and Cyrillic would wreak havoc [and] make casing operations an unholy mess, in effect making all casing operations context sensitive […]". In other words, while the shapes of letters like '''A''', '''B''', '''E''', '''H''', '''K''', '''M''', '''O''', '''P''', '''T''', '''X''', '''Y''' and so on are shared between the Latin, Greek, and Cyrillic alphabets (and small differences in their canonical forms may be considered to be of a merely [[Typography|typographical]] nature), it would still be problematic for a multilingual [[character set]] or a [[font]] to provide only a ''single'' [[code point]] for, say, uppercase letter '''B''', as this would make it quite difficult for a word processor to change that single uppercase letter to one of the three different choices for the lower-case letter, the Latin '''b''' (U+0062), Greek '''β''' (U+03B2) or Cyrillic '''в''' (U+0432).
Therefore, the corresponding Latin, Greek and Cyrillic upper-case letters (U+0042, U+0392 and U+0412, respectively) are also encoded as separate characters, despite their appearance being identical. Without letter case, a "unified European alphabet"{{spaced ndash}}such as '''ABБCГDΔΕЄЗFΦGHIИJ'''...'''Z''', with an appropriate subset for each language{{spaced ndash}}is feasible; but considering letter case, it becomes very clear that these alphabets are rather distinct sets of symbols.

===Methods in word processing===
Most modern [[word processor]]s provide automated case conversion with a simple click or keystroke. For example, in Microsoft Office Word, there is a dialog box for toggling the selected text through UPPERCASE, then lowercase, then Title Case (actually start caps; exception words must be lowercased individually). The keystroke {{keypress|shift|F3}} does the same.

===Methods in programming===
In some forms of [[BASIC]] there are two methods for case conversion:
<syntaxhighlight lang="qbasic">
UpperA$ = UCASE$("a")
LowerA$ = LCASE$("A")
</syntaxhighlight>

[[C (programming language)|C]] and [[C++]], as well as any C-like language that conforms to its [[C standard library|standard library]], provide these functions in the header file [[ctype.h]]:
<syntaxhighlight lang="c">
#include <ctype.h>
char upperA = toupper('a');
char lowerA = tolower('A');
</syntaxhighlight>

Case conversion differs between [[character sets]]. In [[ASCII]] or [[EBCDIC]], case can be converted in the following way, in C:
<syntaxhighlight lang="c">
/* Shift a lower-case letter by the fixed distance between 'a' and 'A'. */
int toupper(int c) { return islower(c) ? c - 'a' + 'A' : c; }
/* Shift an upper-case letter by the same distance in the other direction. */
int tolower(int c) { return isupper(c) ? c - 'A' + 'a' : c; }
</syntaxhighlight>

This works only because the upper- and lower-case letters are spaced out equally. In ASCII they are consecutive, whereas in EBCDIC they are not; nonetheless the upper-case letters are arranged in the same pattern and with the same gaps as the lower-case letters, so the technique still works.
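The equal-spacing argument can be checked with a short Python sketch. It assumes Python's standard <code>ascii</code> codec and uses <code>cp037</code> as one representative EBCDIC code page (an assumption; other EBCDIC variants exist). The helper names are hypothetical.

```python
import string


def case_offsets(encoding):
    """Collect the code difference (lower minus upper) for each of the 26 letters."""
    return {
        lower.encode(encoding)[0] - upper.encode(encoding)[0]
        for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
    }


def is_consecutive(encoding):
    """Report whether the lower-case letters occupy consecutive codes."""
    codes = [ch.encode(encoding)[0] for ch in string.ascii_lowercase]
    return all(b - a == 1 for a, b in zip(codes, codes[1:]))


# Assumption: cp037 stands in for EBCDIC here; it is only one of several code pages.
for name in ("ascii", "cp037"):
    print(name, case_offsets(name), is_consecutive(name))
```

Each encoding yields a single offset shared by all 26 letters, which is what allows case conversion by adding or subtracting a constant, even though the letters are consecutive only in ASCII.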
Some computer programming languages offer facilities for converting text to a form in which all words are capitalised. [[Visual Basic]] calls this "proper case"; [[Python (programming language)|Python]] calls it "title case". This differs from usual [[Capitalization#Title case|title casing]] conventions, such as the English convention in which minor words are not capitalised.
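The difference can be seen in a brief Python sketch. <code>str.title()</code> is the built-in method; <code>english_title_case</code> and its <code>MINOR_WORDS</code> set are hypothetical and deliberately incomplete, shown only to contrast the two behaviours.

```python
# Assumption: a small, illustrative set of minor words; real style guides differ.
MINOR_WORDS = {"a", "an", "and", "of", "in", "on", "the", "to"}


def english_title_case(text: str) -> str:
    """Capitalise every word except minor words that are not first or last."""
    words = text.lower().split()
    last = len(words) - 1
    return " ".join(
        word if word in MINOR_WORDS and 0 < i < last else word.capitalize()
        for i, word in enumerate(words)
    )


print("the lord of the rings".title())              # The Lord Of The Rings
print(english_title_case("the lord of the rings"))  # The Lord of the Rings
```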