==Ordering in the Latin script== ===Basic order and examples=== The standard order of the modern [[ISO basic Latin alphabet]] is: :'''A-B-C-D-E-F-G-H-I-J-K-L-M-N-O-P-Q-R-S-T-U-V-W-X-Y-Z''' An example of straightforward alphabetical ordering follows: *'''''As; Aster; Astrolabe; Astronomy; Astrophysics; At; Ataman; Attack; Baa''''' Another example: *'''''Barnacle; Be; Been; Benefit; Bent''''' The above words are ordered alphabetically. ''As'' comes before ''Aster'' because they begin with the same two letters and ''As'' has no more letters after that whereas ''Aster'' does. The next three words come after ''Aster'' because their fourth letter (the first one that differs) is ''r'', which comes after ''e'' (the fourth letter of ''Aster'') in the alphabet. Those words themselves are ordered based on their sixth letters (''l'', ''n'' and ''p'' respectively). Then comes ''At'', which differs from the preceding words in the second letter (''t'' comes after ''s''). ''Ataman'' comes after ''At'' for the same reason that ''Aster'' came after ''As''. ''Attack'' follows ''Ataman'' based on comparison of their third letters, and ''Baa'' comes after all of the others because it has a different first letter. ===Treatment of multiword strings=== When some of the strings being ordered consist of more than one word, i.e., they contain [[space (character)|spaces]] or other separators such as [[hyphen]]s, then two basic approaches may be taken. In the first approach, all strings are ordered initially according to their first word, as in the sequence: *''Oak; Oak Hill; Oak Ridge; Oakley Park; Oakley River'' *:where all strings beginning with the separate word ''Oak'' precede all those beginning with ''Oakley'', because ''Oak'' precedes ''Oakley'' in alphabetical order. 
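The letter-by-letter comparison described above, and the first (word-by-word) approach to multiword strings, can be sketched in Python. This is a minimal illustration rather than a standard collation routine: the helper name <code>word_by_word_key</code> is hypothetical, and folding case so that capitalisation does not affect the order is an assumption the text does not state.

```python
import re

def word_by_word_key(s: str) -> list[str]:
    """Hypothetical sort key: compare strings one whole word at a time."""
    # Fold case (an assumption), then split on runs of spaces or hyphens.
    return [w for w in re.split(r"[\s\-]+", s.casefold()) if w]

# Single words: ordinary letter-by-letter comparison is enough.
words = ["Baa", "Attack", "At", "Astronomy", "Aster", "As",
         "Ataman", "Astrophysics", "Astrolabe"]
print(sorted(words, key=str.casefold))

# Multiword strings: every "Oak ..." entry precedes every "Oakley ..." entry.
places = ["Oakley River", "Oak Ridge", "Oak", "Oakley Park", "Oak Hill"]
print(sorted(places, key=word_by_word_key))
```

A shorter list of words compares as smaller than a longer list that begins with the same words, which is why ''Oak'' precedes ''Oak Hill'' here.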
In the second approach, strings are alphabetized as if they had no spaces or hyphens,{{efn|In MS Explorer's case, the space, the apostrophe (U+0027), and all of the hyphen-like characters (U+002D and U+2010 through U+2014, inclusive) are omitted from the primary sort key.}} giving the sequence: *''Oak; Oak Hill; Oakley Park; Oakley River; Oak Ridge'' *:where ''Oak Ridge'' now comes after the ''Oakley'' strings, as it would if it were written "Oakridge". The second approach is the one usually taken in dictionaries,{{citation needed|date=October 2020}} and it is thus often called ''[[dictionary order (disambiguation)|dictionary order]]'' by [[publishing|publishers]].{{efn|For instance, the ''Harrap's Shorter Dictionnaire Anglais-Français/ Français-Anglais'', ISBN 0-245-60660-2, page 640, has the ordering ''oil, oil-bearing, oilcan, oilcloth, oil-cooled, oiled ''[…]'' oiliness, oil lamp, oil paint, oil painting, oilpaper''.}} The first approach has often been used in [[index (publishing)|book indexes]], although each publisher traditionally set its own standards for which approach to use therein; there was no ISO standard for book indexes ([[ISO 999]]) before 1975. ===Special cases=== {{Unreferenced section|date=June 2017}} ====Modified letters==== In French, modified letters (such as those with [[diacritic]]s) are treated the same as the base letter for alphabetical ordering purposes. For example, ''rôle'' comes between ''rock'' and ''rose'', as if it were written ''role''. However, languages that use such letters systematically generally have their own ordering rules. See {{slink||Language-specific conventions}} below. ====Ordering by surname==== In most cultures where [[family name]]s are written after [[given name]]s, it is still desired to sort lists of names (as in telephone directories) by family name first. In this case, names need to be reordered to be sorted correctly. 
For example, Juan Hernandes and Brian O'Leary should be sorted as "Hernandes, Juan" and "O'Leary, Brian" even if they are not written this way. Capturing this rule in a computer collation algorithm is complex, and simple attempts will fail. For example, unless the algorithm has at its disposal an extensive list of family names, there is no way to decide if "Gillian Lucille van der Waal" is "van der Waal, Gillian Lucille", "Waal, Gillian Lucille van der", or even "Lucille van der Waal, Gillian". Ordering by surname is frequently encountered in academic contexts. Within a single multi-author paper, ordering the authors alphabetically by surname, rather than by other methods such as reverse seniority or subjective degree of contribution to the paper, is seen as a way of "acknowledg[ing] similar contributions" or "avoid[ing] disharmony in collaborating groups".<ref>{{cite journal|first1=Teja|last1=Tscharntke|first2=Michael E|last2=Hochberg|first3=Tatyana A|last3=Rand|first4=Vincent H|last4=Resh|first5=Jochen|last5=Krauss|title=Author Sequence and Credit for Contributions in Multiauthored Publications|journal=PLOS Biol.|date=January 2007|volume=5|issue=1|pages=e18|pmid=17227141|doi=10.1371/journal.pbio.0050018|pmc=1769438 |doi-access=free }}</ref> The practice in certain fields of ordering [[citation]]s in bibliographies by the surnames of their authors has been found to create bias in favour of authors with surnames which appear earlier in the alphabet, while this effect does not appear in fields in which bibliographies are ordered chronologically.<ref>{{cite journal|url=https://decisionslab.unl.edu/pubs/stevens_duque_2018_SM.pdf|first1=Jeffrey R.|last1=Stevens|first2=Juan F.|last2=Duque|title=Order Matters: Alphabetizing In-Text Citations Biases Citation Rates|journal=Psychonomic Bulletin & Review|year=2018|volume=26|issue=3|pages=1020–1026|doi=10.3758/s13423-018-1532-8|doi-access=free|pmid=30288671|s2cid=52922399|access-date=10 November 2018|archive-date=10 November 
2018|archive-url=https://web.archive.org/web/20181110080311/https://decisionslab.unl.edu/pubs/stevens_duque_2018_SM.pdf|url-status=live}} *{{lay source |author=Colleen Flaherty |title=The Case Against Alphabetical Naming of Authors |url=https://www.insidehighered.com/news/2018/10/22/study-takes-aim-psychologys-practice-ordering-reference-lists-alphabetically|website=[[Inside Higher Ed]] |date=22 October 2018}}</ref> ====''The'' and other common words==== If a phrase begins with a very common word (such as "the", "a" or "an", called articles in grammar), that word is sometimes ignored or moved to the end of the phrase, but this is not always the case. For example, the book "[[The Shining (novel)|The Shining]]" might be treated as "Shining", or "Shining, The" and therefore before the book title "[[Summer of Sam]]". However, it may also be treated as simply "The Shining" and after "Summer of Sam". Similarly, "[[A Wrinkle in Time]]" might be treated as "Wrinkle in Time", "Wrinkle in Time, A", or "A Wrinkle in Time". All three alphabetization methods are fairly easy to create by algorithm, but many programs rely on simple [[lexicographic order]]ing instead. ====''Mac'' prefixes==== {{Main|Mac and Mc together}} The prefixes ''M'' and ''Mc'' in Irish and Scottish surnames are abbreviations for ''Mac'' and are sometimes alphabetized as if the spelling is ''Mac'' in full. Thus ''McKinley'' might be listed before ''Mackintosh'' (as it would be if it had been spelled out as "MacKinley"). Since the advent of computer-sorted lists, this type of alphabetization is less frequently encountered, though it is still used in British telephone directories. ====''St'' prefix==== The prefix ''St'' or ''St.'' is an abbreviation of "Saint", and is traditionally alphabetized as if the spelling is ''Saint'' in full. Thus in a gazetteer ''St John's'' might be listed before ''Salem'' (as it would be if it had been spelled out as "Saint John's"). 
Since the advent of computer-sorted lists, this type of alphabetization is less frequently encountered, though it is still sometimes used. ====Ligatures==== [[Typographic ligature|Ligatures]] (two or more letters merged into one symbol) which are not considered distinct letters, such as [[Æ]] and [[Œ]] in English, are typically collated as if the letters were separate—"æther" and "aether" would be ordered the same relative to all other words. This is true even when the ligature is not purely stylistic, such as in [[loanword]]s and brand names. Special rules may need to be adopted to sort strings which vary only by whether two letters are joined by a ligature. ===Treatment of numerals=== {{Main|Lexicographical order}} {{Unreferenced section|date=June 2017}} When some of the strings contain [[Numerical digit|numeral]]s (or other non-letter characters), various approaches are possible. Sometimes such characters are treated as if they came before or after all the letters of the alphabet. Another method is for numbers to be sorted alphabetically as they would be spelled: for example ''[[1776 (film)|1776]]'' would be sorted as if spelled out "seventeen seventy-six", and {{Lang|fr|[[24 heures du Mans]]}} as if spelled "vingt-quatre..." (French for "twenty-four"). When numerals or other symbols are used as special graphical forms of letters, as ''1337'' for [[leet]] or the movie ''[[Seven (1995 film)|Seven]]'' (which was stylised as ''Se7en''), they may be sorted as if they were those letters. [[Natural sort order]] orders strings alphabetically, except that multi-digit numbers are treated as a single character and ordered by the value of the number encoded by the digits. 
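Natural sort order as described above can be sketched in Python. This is only an illustration under stated assumptions: the helper name <code>natural_key</code> and the sample strings are hypothetical, only the ASCII digits 0–9 are recognised, and case is folded before comparison.

```python
import re

def natural_key(s: str) -> list:
    """Hypothetical sort key: digit runs compare by numeric value."""
    # Splitting on a captured group alternates text, digits, text, ...
    # so odd positions are always digit runs and even positions are text.
    parts = re.split(r"([0-9]+)", s.casefold())
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

titles = ["Chapter 10", "Chapter 2", "Chapter 1"]
print(sorted(titles))                   # character by character: 1, 10, 2
print(sorted(titles, key=natural_key))  # by numeric value: 1, 2, 10
```

Because text and numbers always occupy the same positions in each key, a number is never compared with a piece of text.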
In the case of [[monarch]]s and [[pope]]s, although their numbers are in [[Roman numerals]] and resemble letters, they are normally arranged in numerical order: so, for example, even though V comes after I, the Danish king [[Christian IX of Denmark|Christian IX]] comes after his predecessor [[Christian VIII of Denmark|Christian VIII]]. ===Language-specific conventions=== {{more citations needed|section|date=June 2017}} Languages which use an [[extended Latin alphabet]] generally have their own conventions for treatment of the extra letters. Also in some languages certain [[digraph (orthography)|digraph]]s are treated as single letters for collation purposes. For example, the [[Spanish orthography|Spanish alphabet]] treats ''ñ'' as a basic letter following ''n'', and formerly treated the digraphs ''ch'' and ''ll'' as basic letters following ''c'' and ''l'', respectively. Now ''ch'' and ''ll'' are alphabetized as two-letter combinations. The new alphabetization rule was issued by the [[Royal Spanish Academy]] in 1994. These digraphs were still formally designated as letters but they are no longer so since 2010. On the other hand, the digraph ''rr'' follows ''rqu'' as expected (and did so even before the 1994 alphabetization rule), while vowels with acute accents (''á, é, í, ó, ú'') have always been ordered in parallel with their base letters, as has the letter ''ü''. In a few cases, such as [[Arabic alphabet|Arabic]] and [[Kiowa alphabet|Kiowa]], the alphabet has been completely reordered. Alphabetization rules applied in various languages are listed below. * In [[Arabic Language|Arabic]], there are two main orders of the [[Arabic alphabet|28 letter alphabet]] used today. The standard and most commonly used is the ''[[Arabic alphabet#hijāʾī|hijāʾī]]'' order, which was created by the early Arab linguist [[Nasr ibn 'Asim al-Laythi]] and features a visual ordering method where letters are ordered based on their shapes. 
For example, ''bāʾ'' (ب), ''tāʾ'' (ت), ''thāʾ'' (ث) are grouped as they have the same base shape or ''[[rasm]]'' (ٮ) and are differentiated only by consonant pointing known as ''[[Arabic diacritics#I‘jām (phonetic distinctions of consonants)|iʻjām]]''. The original ''[[Arabic alphabet#Abjadi|ʾabjadī]]'' order, which phonetically resembles that of other [[Semitic languages]] as well as Latin, is still in use today, usually limited to ordering lists in a document, analogous to [[Roman Numerals]]. When the ''ʾabjadī'' order is used in numbering, letters are written in a modified form to distinguish them from letters used in words and from numerals. For example, ''ʾalif'' (ا), which looks identical to the [[Eastern Arabic numeral]] one (١), is written with a small oval loop extending clockwise from the letter's bottom, followed by a short tail (𞺀).{{cn|date=August 2024}} Although these characters are rarely used digitally, they are encoded in Unicode under [[Arabic Mathematical Alphabetic Symbols]].<ref>{{cite web |title=Arabic Mathematical Alphabetic Symbols |url=https://www.unicode.org/charts/PDF/U1EE00.pdf |publisher=The Unicode Standard |access-date=26 November 2022 |archive-date=30 October 2022 |archive-url=https://web.archive.org/web/20221030230610/https://www.unicode.org/charts/PDF/U1EE00.pdf |url-status=live }}</ref> A less common order, the ''{{ill|ṣawtī|ar|ترتيب_صوتي|v=sup}}'' order, is collated phonetically and was created by [[al-Khalil ibn Ahmad al-Farahidi]]. * In [[Azerbaijani language|Azerbaijani]], there are eight additional letters to the standard Latin alphabet. Five of them are vowels: i, ı, ö, ü, [[ə]] and three are consonants: ç, ş, ğ. The alphabet is the same as the [[Turkish alphabet|Turkish]], with the same sounds written with the same letters, except for three additional letters: q, x and ə for sounds that do not exist in Turkish. 
Although all the "Turkish letters" are collated in their "normal" alphabetical order like in Turkish, the three extra letters are collated arbitrarily after letters whose sounds approach theirs. So, q is collated just after k, x (pronounced like a German ''ch'') is collated just after h and ə (pronounced roughly like an English short ''a'') is collated just after e. * In [[Breton language|Breton]], there is no "c", "q", "x" but there are the digraphs "ch" and "c'h", which are collated between "b" and "d". For example: « buzhugenn, chug, c'hoar, daeraouenn » (earthworm, juice, sister, teardrop). * In [[Czech language|Czech]] and [[Slovak language|Slovak]], accented vowels have secondary collating weight – compared to other letters, they are treated as their unaccented forms (in Czech, A-Á, E-É-Ě, I-Í, O-Ó, U-Ú-Ů, Y-Ý, and in Slovak, A-Á-Ä, E-É, I-Í, O-Ó-Ô, U-Ú, Y-Ý), but then they are sorted after the unaccented letters (for example, the correct lexicographic order is baa, baá, báa, báá, bab, báb, bac, bác, bač, báč [in Czech] and baa, baá, baä, báa, báá, báä, bäa, bäá, bää, bab, báb, bäb, bac, bác, bäc, bač, báč, bäč [in Slovak]). Accented consonants have primary collating weight and are collated immediately after their unaccented counterparts, with exception of Ď, Ň and Ť (in Czech) and Ď, Ĺ, Ľ, Ň, Ŕ and Ť (in Slovak), which have again secondary weight. [[Ch (digraph)|CH]] is considered to be a separate letter and goes between [[H]] and [[I]]. In Slovak, [[Dz (digraph)|DZ]] and [[DŽ]] are also considered separate letters and are positioned between [[Ď]] and [[E]]. * In the [[Danish and Norwegian alphabet]]s, the same extra vowels as in Swedish (see below) are also present but in a different order and with different [[glyph]]s (..., X, Y, Z, [[Æ]], [[Ø]], [[Å]]). Also, "Aa" collates as an equivalent to "Å". The Danish alphabet has traditionally seen "W" as a variant of "V", but today "W" is considered a separate letter. 
* In [[Dutch language|Dutch]] the combination IJ (representing [[IJ (letter)|IJ]]) was formerly to be collated as Y (or sometimes as a separate letter: Y < IJ < Z), but is currently mostly collated as 2 letters (II < IJ < IK). Exceptions are phone directories; IJ is always collated as Y here because in many Dutch family names Y is used where modern spelling would require IJ. Note that a word starting with ij that is written with a capital I is also written with a capital J, for example, the town [[IJmuiden]], the river [[IJssel]] and the country IJsland ([[Iceland]]). * In [[Esperanto]], consonants with [[circumflex]] accents ([[c-circumflex|ĉ]], [[g-circumflex|ĝ]], [[h-circumflex|ĥ]], [[j-circumflex|ĵ]], [[s-circumflex|ŝ]]), as well as [[u-breve|ŭ]] (u with [[breve]]), are counted as separate letters and collated separately (c, ĉ, d, e, f, g, ĝ, h, ĥ, i, j, ĵ ... s, ŝ, t, u, ŭ, v, z). * In [[Estonian language|Estonian]] [[õ]], [[ä]], [[ö]] and [[ü]] are considered separate letters and collate after [[w]]. Letters [[š]], [[z]] and [[ž]] appear in loanwords and foreign proper names only and follow the letter [[s]] in the [[Estonian alphabet]], which otherwise does not differ from the basic Latin alphabet. * The [[Faroese alphabet]] also has some of the Danish, Norwegian, and Swedish extra letters, namely [[Æ]] and [[Ø]]. Furthermore, the [[Faroese alphabet]] uses the Icelandic eth, which follows the [[D]]. Five of the six vowels [[A]], [[I]], [[O]], [[U]] and [[Y]] can get accents and are after that considered separate letters. The consonants [[C]], [[Q]], [[X]], [[W]] and [[Z]] are not found. Therefore, the first five letters are [[A]], [[Á]], [[B]], [[D]] and [[Ð]], and the last five are [[V]], [[Y]], [[Ý]], [[Æ]], [[Ø]] * In [[Filipino language|Filipino]] (Tagalog) and other Philippine languages, the letter Ng is treated as a separate letter. It is pronounced as in ''sing'', ''ping-pong'', etc. 
By itself, it is pronounced ''nang'', but in general [[Filipino orthography]], it is spelled as if it were two separate letters (n and g). Also, letter derivatives (such as [[Ñ]]) immediately follow the base letter. Filipino also is written with diacritics, but their use is very rare (except the [[tilde]]). * The [[Finnish alphabet]] and collating rules are the same as those of Swedish. * For [[French language|French]], the ''last'' accent in a given word determines the order.<ref name=unicode10>{{cite web| title=Unicode Technical Standard #10: Unicode collation algorithm| publisher=Unicode, Inc. (unicode.org)| date=20 March 2008| url=https://unicode.org/reports/tr10/| access-date=27 August 2008| archive-date=27 August 2008| archive-url=https://web.archive.org/web/20080827003801/http://www.unicode.org/reports/tr10/| url-status=live}}</ref> For example, in French, the following four words would be sorted this way: cote < côte < coté < côté. The letter e is ordered as e é è ê ë (œ considered as oe), same thing for o as ô ö. * In [[German alphabet|German]] letters with [[Diaeresis (diacritic)|umlaut]] ([[Ä]], [[Ö]], [[Ü]]) are treated generally just like their non-umlauted versions; [[ß]] is always sorted as ss. This makes the alphabetic order Arbeit, Arg, Ärgerlich, Argument, Arm, Assistant, Aßlar, Assoziation. For phone directories and similar lists of names, the umlauts are to be collated like the letter combinations "ae", "oe", "ue" because a number of German surnames appear both with umlaut and in the non-umlauted form with "e" (Müller/Mueller). This makes the alphabetic order Udet, Übelacker, Uell, Ülle, Ueve, Üxküll, Uffenbach. * The [[Hungarian language|Hungarian]] vowels have accents, umlauts, and double accents, while consonants are written with single, double (digraphs) or triple (trigraph) characters. In collating, accented vowels are equivalent with their non-accented counterparts and double and triple characters follow their single originals. 
Hungarian alphabetic order is: A=Á, B, C, Cs, D, Dz, Dzs, E=É, F, G, Gy, H, I=Í, J, K, L, Ly, M, N, Ny, O=Ó, Ö=Ő, P, Q, R, S, Sz, T, Ty, U=Ú, Ü=Ű, V, W, X, Y, Z, Zs. (Before 1984, ''dz'' and ''dzs'' were not considered single letters for collation, but two letters each, d+z and d+zs instead.) This means that, for example, ''nádcukor'' should precede ''nádcsomó'' (even though ''s'' normally precedes ''u''), since ''c'' precedes ''cs'' in the collation. Difference in vowel length should only be taken into consideration if the two words are otherwise identical (e.g. ''egér, éger''). Spaces and hyphens within phrases are ignored in collation. ''Ch'' also occurs as a digraph in certain words but it is not considered a grapheme in its own right in terms of collation. *:A particular feature of Hungarian collation is that contracted forms of double di- and trigraphs (such as {{lang|hu|ggy}} from ''gy + gy'' or {{lang|hu|ddzs}} from ''dzs + dzs'') should be collated as if they were written in full (independently of the fact of the contraction and the elements of the di- or trigraphs). For example, ''kaszinó'' should precede ''kassza'' (even though the fourth character ''z'' would normally come after ''s'' in the alphabet), because the fourth "character" ([[grapheme]]) of the word ''kassza'' is considered a second ''sz'' (decomposing ''ssz'' into ''sz + sz''), which does follow ''i'' (in ''kaszinó'').<!-- source: 14. c) of the Rules of Hungarian Orthography, cf. [[Hungarian orthography]] --> * In [[Icelandic language|Icelandic]], [[Þ]] is added, and D is followed by [[Ð]]. Each vowel (A, E, I, O, U, Y) is followed by its correspondent with [[Acute accent|acute]]: Á, É, Í, Ó, Ú, Ý. There is no Z, so the alphabet ends: ... X, Y, Ý, [[Þ]], [[Æ]], Ö. ** Both letters were also used by [[Anglo-Saxons|Anglo-Saxon]] scribes who also used the Runic letter [[Wynn]] to represent /w/. ** [[thorn (letter)|Þ]] (called thorn; lowercase þ) is also a Runic letter. 
** [[Eth (letter)|Ð]] (called eth; lowercase ð) is the letter [[D]] with an added stroke. * [[Kiowa language|Kiowa]] is ordered on phonetic principles, like the [[Brahmic scripts]], rather than on the historical Latin order. Vowels come first, then stop consonants ordered from the front to the back of the mouth, and from negative to positive [[voice-onset time]], then the affricates, fricatives, liquids, and nasals: :: A, AU, E, I, O, U, B, F, P, V, D, J, T, TH, G, C, K, Q, CH, X, S, Z, L, Y, W, H, M, N * In [[Lithuanian language|Lithuanian]], specifically Lithuanian letters go after their Latin originals. Another change is that [[Y]] comes just before [[J]]: ... G, H, I, Į, Y, J, K... * In the [[Maltese alphabet]] the digraphs GĦ and IE are treated as single letters, and each is listed after the first character of the pair. The dotted letters (Ċ Ġ Ż) are collated before their originals, while Ħ is after H. Accents, apostrophes and hyphens are ignored. However, when two words sort identically these diacritics are taken into consideration, such that accented letters follow non-accented. * In [[Polish language|Polish]], specifically Polish letters derived from the Latin alphabet are collated after their originals: A, Ą, B, C, Ć, D, E, Ę, ..., L, Ł, M, N, Ń, O, Ó, P, ..., S, Ś, T, ..., Z, Ź, Ż. The digraphs for collation purposes are treated as if they were two separate letters. * In [[Pinyin alphabetical order]], where words have the same basic letters in pinyin and differ only in modifying diacritics, the unmodified letter comes before the modified letter. For example, {{angbr|e}} comes before {{angbr|ê}} (額 (''è'') before 欸 (''ê̄'')), and {{angbr|u}} comes before {{angbr|ü}} (路 (''lù'') before 驢 (''lǘ'') and 努 (''nǔ'') before 女 (''nǚ'')). 
Characters with the same pinyin letters (including modified letters {{angbr|ê}} and {{angbr|ü}}) are arranged according to their tones in the order of "first tone (i.e., "flat tone"), second tone (rising tone), third tone (falling-rising tone), fourth tone (falling tone), fifth tone (neutral tone)", for example "媽 (''mā''), 麻 (''má''), 馬 (''mǎ''), 罵 (''mà''), 嗎 (''ma'')".{{efn| There is an exception: In [[ABC Chinese–English Dictionary]] the tone order is "zero tone (neutral tone), first tone (flat tone), second tone (rising tone), third tone (falling-rising tone) and fourth tone (falling tone)".}} * In [[Portuguese alphabet|Portuguese]], the collating order is just like in English: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z. Digraphs and letters with diacritics are not included in the alphabet. * In [[Romanian language|Romanian]], special characters derived from the Latin alphabet are collated after their originals: A, Ă, Â, ..., I, Î, ..., S, Ș, T, Ț, ..., Z. * In [[Serbo-Croatian]] and other related South Slavic languages, the five accented characters and three conjoined characters are sorted after the originals: ..., C, Č, Ć, D, DŽ, Đ, E, ..., L, LJ, M, N, NJ, O, ..., S, Š, T, ..., Z, Ž. * [[Spanish alphabet|Spanish]] treated (until 1994) "CH" and "LL" as single letters, giving an ordering of ''{{Wikt-lang|es|cinco}}, {{Wikt-lang|es|credo}}, {{Wikt-lang|es|chispa}}'' and ''{{Wikt-lang|es|lomo}}, {{Wikt-lang|es|luz}}, {{Wikt-lang|es|llama}}.'' This is not true any more since in 1994 the [[Real Academia Española|RAE]] adopted the more conventional usage, and now LL is collated between LK and LM, and CH between CG and CI. 
The six characters with diacritics Á, É, Í, Ó, Ú, Ü are treated as the original letters A, E, I, O, U, for example: ''{{Wikt-lang|es|radio}}, {{Wikt-lang|es|ráfaga}}, {{Wikt-lang|es|rana}}, {{Wikt-lang|es|rápido}}, {{Wikt-lang|es|rastrillo}}.'' The only Spanish-specific collating question is [[Ñ]] ({{Wikt-lang|es|eñe}}) as a different letter collated after N. * In the [[Swedish alphabet]], there are three extra [[vowel]]s placed at its end (..., X, Y, Z, [[Å]], [[Ä]], [[Ö]]), similar to the Danish and Norwegian alphabet, but with different glyphs and a different collating order. The letter "W" has been treated as a variant of "V", but in the 13th edition of ''[[Svenska Akademiens Ordlista|Svenska Akademiens ordlista]]'' (2006) "W" was considered a separate letter. * In the [[Turkish alphabet]] there are six additional letters: ç, ğ, ı, ö, ş, and ü (but no q, w, and x). They are collated with ç after c, ğ after g, ı ''before'' i, ö after o, ş after s, and ü after u. Originally, when the alphabet was introduced in 1928, ı was collated after i, but the order was changed later so that letters having shapes containing dots, cedillas or other adorning marks always follow the letters with corresponding bare shapes. Note that in Turkish orthography the letter I is the majuscule of dotless ı, whereas İ is the majuscule of dotted i. * In many [[Turkic languages]] (such as [[Azeri language|Azeri]] or the [[Yañalif|Jaꞑalif]] orthography for [[Tatar language|Tatar]]), there used to be the letter [[Gha]] (Ƣƣ), which came between [[G]] and [[H]]. It is now in disuse. * In [[Vietnamese language|Vietnamese]], there are seven additional letters: [[ă]], [[â]], [[đ]], [[ê]], [[ô]], [[ơ]], [[ư]] while [[f]], [[j]], [[w]], [[z]] are absent, even though they are still in some use (as in Internet addresses and foreign loanwords). "f" is replaced by the combination "ph". Similarly, "w" is replaced by "qu". 
* In [[Volapük]] [[ä]], [[ö]] and [[ü]] are counted as separate letters and collated separately (a, ä, b ... o, ö, p ... u, ü, v) while [[q]] and [[w]] are absent.<ref>{{Cite web |last=Midgley |first=Ralph |title=Volapük to English dictionary |url=http://volap%C3%BCk.com/VoEnDictionary-20100830.pdf |archive-url=https://web.archive.org/web/20120901034151/http://xn--volapk-7ya.com/VoEnDictionary-20100830.pdf |archive-date=1 September 2012 |url-status=dead |df=dmy |access-date=24 September 2019 }}</ref> * In [[Welsh language|Welsh]] the digraphs CH, DD, FF, NG, LL, PH, RH, and TH are treated as single letters, and each is listed after the first character of the pair (except for NG which is listed after G), producing the order A, B, C, CH, D, DD, E, F, FF, G, NG, H, and so on. It can sometimes happen, however, that word compounding results in the juxtaposition of two letters which do ''not'' form a digraph. An example is the word LLONGYFARCH (composed from LLON + GYFARCH). This results in such an ordering as, for example, LAWR, LWCUS, LLONG, LLOM, LLONGYFARCH (NG is a digraph in LLONG, but not in LLONGYFARCH). The letter combination R+H (as distinct from the digraph RH) may similarly arise by juxtaposition in compounds, although this tends not to produce any pairs in which misidentification could affect the ordering. For the other potentially confusing letter combinations that may occur – namely, D+D and L+L – a hyphen is used in the spelling (e.g. AD-DAL, CHWIL-LYS).
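Several of the conventions listed above amount to sorting by a tailored alphabet in which some digraphs count as single letters. The following Python sketch illustrates the idea with the Spanish rules before and after 1994. It is an illustration under stated assumptions, not a full collation algorithm: the names <code>make_alphabet_key</code>, <code>MODERN</code> and <code>TRADITIONAL</code> are hypothetical, accented vowels are not folded to their base letters, and the greedy longest-match tokenisation cannot detect cases such as Welsh LLONGYFARCH, where two adjacent letters do not form a digraph.

```python
def make_alphabet_key(alphabet):
    """Build a sort key from an ordered list of letters (digraphs allowed)."""
    rank = {letter: i for i, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)

    def key(word):
        text = word.casefold()
        ranks, i = [], 0
        while i < len(text):
            # Greedy match: try the longest candidate letter first.
            for size in range(longest, 0, -1):
                chunk = text[i:i + size]
                if chunk in rank:
                    ranks.append(rank[chunk])
                    i += len(chunk)
                    break
            else:
                # Assumption: characters outside the alphabet sort last.
                ranks.append(len(alphabet) + ord(text[i]))
                i += 1
        return ranks

    return key

MODERN = "a b c d e f g h i j k l m n ñ o p q r s t u v w x y z".split()
TRADITIONAL = ("a b c ch d e f g h i j k l ll m n ñ o p q r s t u v w x y z"
               .split())

modern_key = make_alphabet_key(MODERN)
traditional_key = make_alphabet_key(TRADITIONAL)

words = ["chispa", "credo", "cinco", "llama", "luz", "lomo"]
print(sorted(words, key=traditional_key))  # ch and ll as single letters
print(sorted(words, key=modern_key))       # ch and ll as letter pairs
```

Production software typically relies on locale-aware collation libraries instead of hand-built keys; the sketch only shows why the same word list sorts differently under the two rules.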