Incubator escapee wiki:Language recognition chart
==Latin alphabet (possibly extended)==
===[[Romance languages]]===
Lots of [[Latin]] roots.
====[[French language|French]] ([[:fr:Français|Français]])====
* Accented letters: ''â ç è é ê î ô û'', rarely ''ë ï''; ''ù'' only in the word ''où'', ''à'' only at the ends of a few words (including ''à''). Never ''á í ì ó ò ú''.
* Angle quotation marks: « » (though "curly-Q" quotation marks are also used); dialogue traditionally indicated by means of dashes.
* Common short words: ''la'', ''le'', ''les'', ''un'', ''une'', ''des'', ''de'', ''du'', ''à'', ''au'', ''et'', ''ou'', ''où'', ''sur'', ''il'', ''elle'', ''ils'', ''se'', ''je'', ''vous'', ''que'', ''qui'', ''y'', ''en'', ''si'', ''ne'', ''est'', ''sont'', ''a'', ''ont''.
* Many apostrophised contractions for common pronouns and particles, i.e. the words ''l{{'}}'' or ''d{{'}}'', less often ''c{{'}}'', ''j{{'}}'', ''m{{'}}'', ''n{{'}}'', ''s{{'}}'', ''t{{'}}'', or rarely ''z{{'}}''; these occur only before a word starting with a vowel or, in some cases, an ''h''.
* Common digraphs and trigraphs:
** Vowel digraphs: ''au'', ''ai'', ''ei'', ''ou''. Word-final ''-ez''.
** Vowel digraphs (nasals): ''an'', ''en'', ''in'', ''on'', rarely ''un''. For all of these, the ''n'' becomes ''m'' before ''b'', ''p'' or ''m'' (e.g. ''embouchure'', never *''enbouchure'').
** Vowel trigraphs: ''eau'', ''ein'', ''ain'', ''oin''.
** Consonant digraphs: ''ch'', ''gu-''. Rarely ''sh''. Semi-consonant ''-ill-''.
* Letters ''w'' and ''k'' are rare and used only in loanwords, most often from Germanic languages (e.g. ''whisky'').
* Ligatures ''œ'' and ''æ'' are conventional but rarely used (a few words are well known, e.g. ''œil'', ''œuf(s)'', ''bœuf(s)''; most others are scientific or technical and borrowed from Latin).
* Words ending in ''-aux'', ''-eux'', or ''-oux''.
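The French cues above can be turned into a rough automated check. The following is only a minimal sketch: the function name <code>french_score</code>, the choice of a subset of the short words, and the idea of scoring by word share are illustrative assumptions, not an established method, and a short or atypical text can easily fool it.

```python
import re

# Letters the chart lists as never occurring in French.
NEVER_IN_FRENCH = set("áíìóòú")

# A subset of the common short words from the chart (an arbitrary selection).
FRENCH_SHORT_WORDS = {
    "la", "le", "les", "un", "une", "des", "de", "du",
    "et", "ou", "est", "sont", "que", "qui",
}


def french_score(text):
    """Return the share of words that are common French short words.

    Returns 0.0 if the text contains a letter French never uses,
    or if the text contains no words at all.
    """
    lowered = text.lower()
    if any(ch in NEVER_IN_FRENCH for ch in lowered):
        return 0.0
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscore).
    words = re.findall(r"[^\W\d_]+", lowered)
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in FRENCH_SHORT_WORDS)
    return hits / len(words)
```

A higher score only suggests French; several of these short words also occur in other Romance languages, so the result is best combined with the other cues in this section.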
====[[Spanish language|Spanish]] ([[:es:Idioma español|Español]])==== *Characters: ¿ ¡ (inverted question and exclamation marks), ñ *All vowels (á, é, í, ó, ú) may take an acute accent *The letter ''u'' can take a diaeresis (ü), but only after the letter g *Some words frequently used: de, el, del, los, la(s), uno(s), una(s), y *No apostrophised contractions *No use of grave accent *Letters ''k'' and ''w'' are rare and only used in loanwords (e.g. ''walkman'') *Word beginnings: ll- (check not Welsh or Catalan) double L (ll) *Word endings: -o, -a, -ción, -miento, -dad *Angle quotation marks: « » (though "curly-Q" quotation marks are also used); dialogue often indicated by means of dashes ====[[Italian language|Italian]] ([[:it:Lingua italiana|Italiano]])==== *Almost every native word ends in a vowel. Example exceptions include ''non'', ''il'', ''per'', ''con'', ''del''. *Common one-letter word: ''è''. *Common word: ''perché''. *Letter sequences: ''gli'', ''gn'', ''sci''. *Letters ''j'', ''k'', ''w'', ''x'' and ''y'' are rare and used only in loanwords (e.g. ''whisky''). *Word endings: ''-o'', ''-a'', ''-zione'', ''-mento'', ''-tà'', ''-aggio''. *Grave accent (e.g., on à) almost always occurs in the last letter of words. *Double consonants (''tt'', ''zz'', ''cc'', ''ss'', ''bb'', ''pp'', ''ll'', etc.) are frequent. ====[[Catalan language|Catalan]] ([[:ca:Català|Català]])==== *Characters: à, è, é, í, ï, ò, ó, ú, ü, ç, · *Character combination ''tz'' (also common in Basque, however) and ''l·l'' *Syllables and words ending in ''-aig'', ''-eig'', ''-oig'', ''-uig'', ''-aix'', ''-eix'', ''-oix'', ''-uix'' *Letter sequences: ''tx'' (also common in Basque, however) and ''tg'' *Letter ''y'' is only used in the combination ''ny'' and loanwords *Letters ''k'' and ''w'' are rare and only used in loanwords (e.g. 
''walkman'') *Word endings: ''-o'', ''-a'', ''-es'', ''-ció'', ''-tat'', ''-ment'' *Word beginning: ''ll-'' (also common in Spanish and Welsh, however) *Common words: ''això'', ''amb'', ''mateix'', ''tots'', ''que'' ====[[Romanian language|Romanian]] ([[:ro:Limba română|Română]])==== *Characters: ă â î ș ț *Common words: și, de, la, a, ai, ale, alor, cu *Word endings: -a, -ă, -u, -ul, -ului, -ție (or -țiune), -ment, -tate; names ending in -escu *Double and triple i: copii, copiii *Note that Romanian is sometimes written online with no diacritics, making it harder to identify. A cedilla is sometimes used on S (ş) and on T (ţ) instead of the correct diacritic, the comma (above). ====[[Portuguese language|Portuguese]] ([[:pt:Língua portuguesa|Português]])==== *Characters: ã, õ, â, ê, ô, á, é, í, ó, ú, à, ç *Common one-letter words: a, à, e, é, o *Common two-letter words: ao, as, às, da, de, do, em, os, ou, um *Common three-letter words: aos, com, das, dos, ele, ela, mas, não, por, que, são, uma *Common endings: -ção, -dade, -ismo, -mente *Common digraphs: ch, nh, lh; examples: chave, galinha, baralho. *The letters k, w and y are rare. They are found mostly in loanwords, e.g.: ''keynesianismo'', ''walkie-talkie'', ''nylon''. *Most singular words end in a vowel, l, m, r, or z. *Plural words end in -s. ====[[Walloon language|Walloon]] ([[:wa:Walon|Walon]])==== *Characters: å, é, è, ê, î, ô, û *Common digraphs and trigraphs: ai, ae, én, -jh-, tch, oe, -nn-, -nnm-, xh, ou *Common one-letter words: a, å, e, i, t', l', s', k' *Common two-letter words: al, ås, li, el, vs, ki, si, pô, pa, po, ni, èn, dj' *Common three-letter words: dji, nén, rén, bén, pol, mel *Common endings: -aedje, -mint, -xhmint, -ès, -ou, -owe, -yî, -åcion *Apostrophes are followed by a space (preferably non breaking one), eg: ''l' ome'' instead of ''l'ome''. ====[[Galician language|Galician]] ([[:gl:Lingua galega|Galego]])==== *Similar to Portuguese; the indefinite article "unha" (fem. 
sing.), the suffix -ción and a heavier usage of the letter "x" usually signal Galician.
*Definite articles o (masc. sing.), os (masc. plural), a (fem. sing.), as (fem. plural)
*Common digraphs: nh (ningunha)
*The letters j, k, w and y are not in the alphabet, and appear only in loanwords
===[[Germanic languages]]===
====[[English language|English]]====
*words: ''a'', ''an'', ''and'', ''in'', ''of'', ''on'', ''the'', ''that'', ''to'', ''is'', ''what'', ''I'' (''I'' is always capital when talking about oneself)
*letter sequences: ''th'', ''ch'', ''sh'', ''wh'', ''ough'', ''augh'', ''qu''
*word endings: ''-ing'', ''-tion'', ''-ed'', ''-age'', ''-s'', ''-’s'', ''-’ve'', ''-n’t'', ''-’d''
*vast majority of words end with a consonant, or sometimes with an e. Some common exceptions: ''who,'' ''to,'' ''so,'' ''no,'' ''do,'' ''a,'' and a few names like ''Julia.''
*diacritics or accents only in loanwords (piñata)
====[[Dutch language|Dutch]] ([[:nl:Nederlands|Nederlands]])====
*letter sequences ''ij'' (capitalized as ''IJ'', and also found as a ligature, ''Ĳ'' or ''ĳ''), ''ei'', ''ou'', ''au'', ''oe'', doubled vowels (but not ''ii''), ''kw'', ''ch'', ''sch'', ''oei'', ''ooi'', ''aai'' and ''uw'' (especially ''eeuw'', ''ieuw'', ''auw'', and ''ouw'').
*all consonants, except ''h'', ''j'', ''q'', ''v'', ''w'', ''x'' and ''z'' can be doubled.
*the letters ''c'' (except in the sequence ''(s)ch''), ''q'', ''x'' and ''y'' are almost only found in loanwords.
*words: ''het, op, en, een, voor'' (and compounds of ''voor'').
*word endings: ''-tje'', ''-sje'', ''-ing'', ''-en'', ''-lijk''
*at the start of words: ''z-, v-, ge-''
*''t/m'' occasionally occurs between two points in time or between numbers (e.g. house numbers).
====[[West Frisian language|West Frisian]] ([[:fy:Frysk|Frysk]])====
*letter sequences: ''ij'', ''ei'', ''oa''
*words: yn
====[[Afrikaans language|Afrikaans]] ([[:af:Afrikaans (taal)|Afrikaans]])====
*Words: ''<nowiki>'n</nowiki>'', ''as'', ''vir'', ''nie''.
*Similar to [[#Dutch (Nederlands)|Dutch]], but: **the common Dutch letters ''c'' and ''z'' are rare and used only in loanwords (e.g. ''chalet''); **the common Dutch vowel ''ij'' is not used; instead, ''i'' and ''y'' are used (e.g. ''-lik'', ''sy''); **the common Dutch word ending ''-en'' is rare, being replaced by ''-e''. ====[[German language|German]] ([[:de:Deutsche Sprache|Deutsch]])==== *umlauts (ä, ö, ü), ess-zett (ß) *letter sequences: ''ch'', ''ck'', ''sch'', ''tsch'', ''tz'', ''ss'', *common words: ''der'', ''die'', ''das'', ''den'', ''dem'', ''des'', ''er'', ''sie'', ''es'', ''ist'', ''ich'', ''du'', ''aber'' *common endings: ''-en'', ''-er'', ''-ern'', ''-st'', ''-ung'', ''-chen'', ''-tät'' *rare letters: ''x'', ''y'' (except in loanwords) *letter ''c'' rarely used except in the sequences listed above and in loanwords *long compound words *a period (.) after ordinal numbers, e.g. ''3. Oktober'' *many capitalised words in the middle of sentences since German capitalizes all nouns. 
====[[Swedish language|Swedish]] ([[:sv:Svenska|Svenska]])==== *letters å, ä, ö, rarely é *common words: ''och'', ''i'', ''att'', ''det'', ''en'', ''som'', ''är'', ''av'', ''den'', ''på'', ''om'', ''inte'', ''men'' *common endings: ''-ning'', ''-lig'', ''-isk'', ''-ande'', ''-ade'', ''-era'', ''-rna'' *common surname endings: ''-sson'', ''-berg'', ''-borg'', ''-gren'', ''-lund'', ''-lind'', ''-ström'', ''-kvist/qvist/quist'' *long compound words *letter sequences: ''stj'', ''sj'', ''skj'', ''tj'', ''ck'', ''än'' *no use of characters ''w'', ''z'' except for foreign proper nouns and some loanwords but ''x'' is used, unlike Danish and Norwegian, which replace it with ''ks'' *doubling of consonants common, but doubling of vowels very rare ====[[Danish language|Danish]] ([[:Da:Dansk|Dansk]])==== *letters æ, ø, å *common words: ''af, og, til, er, på, med, det, den''; *common endings: ''-tion'', ''-ing'', ''-else'', ''-hed''; *long compound words; *no use of character ''q'', ''w'', ''x'' and ''z'' except for foreign proper nouns and some loanwords; *to distinguish from Norwegian: uses letter combination ''øj''; frequent use of ''æ''; spellings of borrowed foreign words are retained (in particular use of ''c''), such as ''centralstation''. *doubling of consonants common (but not at the end of words, unlike Norwegian and Swedish), but doubling of vowels very rare *pre-1948 orthography: ''aa'' was used instead of ''å''; all nouns were capitalized ====[[Norwegian language|Norwegian]] ([[:no:Norsk|Norsk]])==== *letters æ, ø, å *common words: ''av, ble, er, og, en, et, men, i, å, for, eller''; *common endings: ''-sjon'', ''-ing'', ''-else'', ''-het''; *long compound words; *no use of character ''c'', ''w'', ''z'' and ''x'' except for foreign proper nouns and some loanwords; *two versions of the language: [[Bokmål]] (much closer to Danish) and [[Nynorsk]] – for example ''ikke, lørdag, Norge'' (Bokmål) vs. 
''ikkje, laurdag, Noreg'' (Nynorsk); Nynorsk uses the word ''òg''; printed materials almost always published in Bokmål only; *to distinguish from Danish: uses letter combination ''øy''; less frequent use of ''æ'' (mainly but not exclusively before ''r''); spellings of borrowed foreign words are ‘Norsified’ (in particular removing use of ''c''), such as ''sentralstasjon''. *doubling of consonants common (including the end of words), but doubling of vowels very rare ====[[Icelandic language|Icelandic]] ([[:is:Íslenska|Íslenska]])==== *letters ''á, ð, é, í, ó, ú, ý, þ, æ, ö'' *common beginnings: ''fj-'', ''gj-'', ''hj-'', ''hl-'', ''hr-'', ''hv-'', ''kj-'', and ''sj-'', *common endings: ''-ar'' (especially ''-nar''), ''-ir'' (especially ''-nir''), ''-ur'', ''-nn'' (especially ''-inn'') *no use of character ''c'', ''q'', ''w'', or ''z'' except for foreign proper nouns, some loanwords, and, in the case of ''z'', older texts. *doubling of consonants common, but doubling of vowels very rare ====[[Faroese language|Faroese]] ([[:fo:Føroyskt|Føroyskt]])==== *letters ''á, ð, í, ó, ú, ý, æ, ø'' *letter combinations: ''ggj'', ''oy'', ''skt'' *to distinguish from Icelandic: does not use é or þ, uses ø instead of ö (occasionally rendered as ö on road signs, or even ő). *doubling of consonants common, but doubling of vowels very rare ===[[Baltic languages]]=== ==== [[Latvian language|Latvian]] ([[:lv:Latviešu valoda|Latviešu]])==== *uses [[diacritics]]: ā, č, ē, ģ, ī, ķ, ļ, ņ, ō, ŗ, š, ū, ž * no use of character ''q'', ''w'', ''x'', or ''y'' except for foreign brand names, international symbols, some loanwords (e.g. ''{{lang|lv|queer}}''), and, in the case of ''w'', older texts. *no longer uses ō or ŗ in modern language *extremely rare doubling of [[vowels]] *rare doubling of [[consonants]] *a period (.) after ordinal numbers, e.g. ''2005. 
gads''
*common words: ''ir'', ''bija'', ''tika'', ''es'', ''viņš''
==== [[Lithuanian language|Lithuanian]] ([[:lt:Lietuvių kalba|Lietuvių]])====
*visual abundance of letters ą, č, ę, ė, į, š, ų, ū, ž
*does not have letters q, w, x
*extremely rare doubling of [[vowel]]s and [[consonant]]s
*many varying forms (usually endings) of the same word, e.g. namas, namo, namus, namams, etc.
*generally long words (absence of articles and fewer prepositions in comparison to Germanic languages)
*common words: ''ir'', ''yra'', ''kad'', ''bet''.
===[[Slavic languages]]===
====[[Polish language|Polish]] ([[:pl:Język polski|Polski]])====
*consonant clusters ''rz, sz, cz, prz, trz''
*includes: ą, ę, ć, ś, ł, ń, ó, ż, ź
*words ''w, z, we, i, na'' (several one-letter words)
*words ''jest, się''
*words beginning with ''był, będzie, jest'' (forms of [[copula (linguistics)|copula]] ''być'', "to be").
====[[Czech language|Czech]] ([[:cs:Čeština|Čeština]])====
*visual abundance of letters ''ž š ů ě ř''
*words ''je, v''
*to distinguish from Slovak: does not use ä, ľ, ĺ, ŕ or ô; ú only appears at the beginning of words.
====[[Slovak language|Slovak]] ([[:sk:Slovenčina|Slovenčina]])====
*visual abundance of letters ''ž š č'';
*uses: ä, ľ, and ô and (very rarely) ĺ and ŕ;
*typical suffixes: ''-cia'', ''-ť'';
*to distinguish from Czech: does not use ě, ř or ů.
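The Czech and Slovak entries above each name letters that the other language does not use, which lends itself to a simple set test. The sketch below is only an illustration under that assumption: the function name <code>czech_or_slovak</code> is hypothetical, and a text containing none of the distinguishing letters (or a quoted name from the other language) is left undetermined.

```python
import unicodedata

# Letters the chart says Czech does not use (Slovak markers).
SLOVAK_ONLY = set("äľĺŕô")
# Letters the chart says Slovak does not use (Czech markers).
CZECH_ONLY = set("ěřů")


def czech_or_slovak(text):
    """Guess Czech vs. Slovak from the distinguishing letters alone."""
    # Normalize to NFC so that letter + combining mark becomes one code point.
    letters = set(unicodedata.normalize("NFC", text).lower())
    has_czech = bool(letters & CZECH_ONLY)
    has_slovak = bool(letters & SLOVAK_ONLY)
    if has_czech and not has_slovak:
        return "Czech"
    if has_slovak and not has_czech:
        return "Slovak"
    # Neither set, or both sets, was found: the letters do not decide.
    return "undetermined"
```

For example, <code>czech_or_slovak("Kôň je vo dvore")</code> returns <code>"Slovak"</code> because of the ''ô''.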
====[[Croatian language|Croatian]] ([[:hr:Hrvatski jezik|Hrvatski]])====
*similar to Serbian
*letters-digraphs ''dž, lj, nj''
*does not have q, w, x, y
*typical suffixes: ''-ti'', ''-ći''
*special letters: č, ć, š, ž, đ
*common words: a, i, u, je
*to distinguish from Serbian: sequences ''-ije-'' and ''-je-'' are common; verbs ending in ''-irati'', ''-iran''
====[[Serbian language|Serbian]] ([[:sr:Српски језик|Srpski/Српски]])====
=====[[Serbian Latin alphabet|Serbian Latin]]=====
*similar to Croatian
*letters-digraphs ''dž, lj, nj'' (lj and nj are somewhat more common than dž, although not by much)
*no q, w, x, y
*typical verb suffixes ''-ti'', ''-ći'' (infinitive is much less used than in Croatian)
*foreign words might end in ''-tija'', ''-ovan'', ''-ovati'', ''-uje''
*special letters: đ (rare), č, š (common), ć, ž (less common)
*common words: a, i, u, je, jeste
*[[future tense]] suffix ''-iće'', ''-ićeš'', ''-ićemo'', ''-ićete'' (not found in Croatian)
*vowel sequences ''-ije-'' and ''-je-'' are very common in the Serbian spoken in Bosnia and Herzegovina, Montenegro and Croatia (ijekavica), but do not appear in Serbia, where each of those sequences is replaced by ''-e-'' (ekavica).
=====[[Serbian Cyrillic]]===== *uses Џ, Ј, Љ, Њ, Ђ, Ћ *does not use Щ, Ъ, Ы, Ь, Э, Ю, Я, Ё, Є, Ґ, Ї, І, Ў *to distinguish from Macedonian: does not use Ѕ, Ѓ, Ќ ===[[Celtic languages]]=== ====[[Welsh language|Welsh]] ([[:cy:Cymraeg|Cymraeg]])==== *letters ''Ŵ, ŵ'' used in Welsh *words ''y, yr, yn, a, ac, i, o'' *letter sequences ''wy, ch, dd, ff, ll, mh, ngh, nh, ph, rh, th, si'' *letters not used: ''k, q, v, x, z'' *letter only used rarely, in loanwords: ''j'' *commonly accented letters: ''â, ê, î, ô, û, ŵ, ŷ'', although acute (''´''), grave (''`''), and dieresis (''¨'') accents can hypothetically occur on all vowels *word endings: ''-ion, -au, -wr, -wyr'' *''y'' is the most common letter in the language *''w'' between consonants (''w'' in fact represents a vowel in the Welsh language) *circumflex accent (''^'') is by far the commonest diacritical mark, although diacritics are often omitted altogether ====[[Irish language|Irish]] ([[:ga:Gaeilge|Gaeilge]])==== *vowels with acute accents: ''á é í ó ú'' *words beginning with letter sequences ''bp dt gc bhf'' *letter sequences ''sc cht'' *no use of the letter J, K, Q, V, W. *frequent bh, ch, dh, fh, gh, mh, th, sh *to distinguish from (Scottish) Gaelic: there may be words or names with the second (or even third) letter capitalized instead of the first: ''hÉireann''. ====[[Scottish Gaelic language|Scottish Gaelic]] ([[:gd:Gàidhlig|Gàidhlig]])==== *vowels with grave accents: ''à è ì ò ù'' (''é'' and ''ó'' still occasionally seen but usage is now discouraged) *letter sequences ''sg chd'' *frequent bh, ch, dh, fh, gh, mh, th, sh *to distinguish from Irish: prefixes are hyphenated, so capitals in the middle of words generally do not occur: ''an t-Oban''. ===[[Albanian language|Albanian]] ([[:sq:Shqip|Shqip]])=== *unique letters: ''ë'', ''ç''. *''ë'' is the most common letter in the language. *the letter ''w'' is not used except in loanwords. 
*''dh'', ''gj'', ''ll'', ''nj'', ''rr'', ''sh'', ''th'', ''xh'', and ''zh'' are considered one letter instead of two.
*common words: po, jo, dhe, i, të, me
===[[Maltese language|Maltese]] ([[:mt:Lingwa Maltija|Malti]])===
*unique letters: ċ, ġ, ħ, għ, ż
*Semitic origin, fairly intelligible with Arabic
*uses il-xxx for the definite article
===[[Iranian languages]]===
====[[Kurdish language|Kurdish]] ([[:ku:Zimanê kurdî|Kurdî / كوردی]])====
*uses circumflex ( ^ ): ê, î, û and cedilla ( ¸ ): ç, ş
*the word ''xwe'' (oneself, myself, yourself etc.) appears frequently and is highly specific (''xw'' combination)
*( I, i ) is the most common letter in the language
*uses eight vowels (a, e, ê, i, î, o, u, û)
*impossible to find a word without any vowel
*has lots of compound words
===[[Finno-Ugric languages]]===
==== [[Finnish language|Finnish]] ([[:fi:Suomen kieli|Suomi]])====
*distinct letters ''å'', ''ä'' and ''ö''; but never ''õ'' or ''ü'' (''y'' takes the place of ''ü'')
*''b'', ''f'', ''z'', ''š'' and ''ž'' appear in [[loanwords]] and [[proper names]] only; the last two are substituted with ''sh'' or ''zh'' in some texts
*''c'', ''q'', ''w'', ''x'', ''å'' appear in (typically foreign) proper names only
*outside of loanwords, ''d'' appears only between vowels or in ''hd''
*outside of loanwords, ''g'' only appears in ''ng''
*outside of loanwords, words do not begin with two consonants; this is reflected in the general syllable structure, where consonant clusters only occur across syllable boundaries, except in some loanwords
*common words: ''sinä'', ''on''
*common endings: ''-nen'', ''-ka''/''-kä'', ''-in'', ''-t'' (plural suffix)
*common vowel combinations: ''ai'', ''uo'', ''ei'', ''ie'', ''oi'', ''yö'', ''äi''
*unusually high degree of letter duplication; both vowels and consonants are frequently doubled, for example ''aa'', ''ee'', ''ii'', ''kk'', ''ll'', ''ss'', ''yy'', ''ää''
*frequent long words
==== [[Estonian language|Estonian]] ([[:et:Eesti keel|Eesti]])====
*distinct letters: ''õ'', ''ä'', ''ö'' and ''ü''; but never ''ß'' or ''å'' *similar to Finnish, except: **letter ''y'' is not used, except in loanwords (''ü'' is the corresponding vowel) **letters ''b'' and ''g'' (without preceding ''n'') are found outside of loanwords **occasional use of ''š'' and ''ž'', mainly in loanwords (plus combination ''tš'') **loanwords more common generally than in Finnish, mainly loaned from German **words end in consonants more frequently than in Finnish, word-final ''b'', ''d'', ''v'' being particularly typical **letter ''d'' is much more common in Estonian than in Finnish, and in Estonian it is often the last letter of the word (plural suffix), which it never is in Finnish **double ''öö'' more common than in Finnish; other doubles can include ''õõ'', ''üü'', rarely ''hh'' (for German ''ch'') and even ''šš'' *common words: ''ja'', ''on'', ''ei'', ''ta'', ''see'', ''või''. ==== [[Hungarian language|Hungarian]] ([[:hu:Magyar nyelv|Magyar]])==== *letters ő and ű ([[double acute accent]]) unique to Hungarian *accented letters ''á'' and ''é'' frequent *letter combinations: ''cs, dz, dzs, gy, ly, ny, sz, ty, zs'' (all classed as separate letters), ''leg‐, ‐obb'' (note: ''sz'' also common in [[#Polish_.28Polski.29|Polish]]) *common words: ''a, az, ez, egy, és, van, hogy'' *letter ''k'' very frequent (plural suffix) *letter ''q'' extremely ''in''frequent (no use of the letter aside from clearly foreign words and a few proper names) ===[[Eskimo–Aleut languages]]=== ====[[Greenlandic language|Greenlandic]] ([[:kl:Kalaallisut|Kalaallisut]])==== *long polysynthetic words (a single word can number 30+ letters) *relatively abundant ''n'', ''q'' (not necessarily followed by ''u''), ''u'' *ubiquitous double consonants and vowels (''aa'', ''ii'', ''qq'', ''uu'', more rarely ''ee'', ''oo'') *vowels ''a'', ''i'', ''u'' conspicuously more frequent than ''e'', ''o'' (which are only found before ''q'' and ''r'') *no diphthongs except occasional word-final 
''ai'', only consonant combinations besides double consonants and ''(n)ng'' consist of ''r'' + consonant *old spellings (now abolished in the spelling reform of 1973) sometimes included acute accent, circumflex, tilde, and/or the letter [[kra]] (''Kʼ ĸ''): ''Kʼânâĸ'' vs. ''Qaanaaq''. ===[[Southern Athabaskan languages]]=== *vowels with acute accent, [[ogonek]] (nasal hook), or both: á, ą, ą́ *doubled vowels: aa, áá, ąą, ą́ą́ *slashed ''l'': ł (check not Polish!) *''n'' with acute accent: ń *quotation mark: ' or ’ *sequences: dl, tł, tł’, dz, ts’, ií, áa, aá *may have rather long words ====[[Navajo language|Navajo]] ([[:nv:Diné bizaad|Diné bizaad]])==== In addition to the above, *does <u>not</u> use ''u'', ''ú'', or ''ų'' ====([[Mescalero-Chiricahua language|Mescalero / Chiricahua]]) (Mashgaléń / Chidikáágo)==== In addition to the above, *uses: u, ú, ų *does <u>not</u> use ''o'', ''ó'', or ''ǫ'' ===[[Guarani language|Guaraní]]=== *lots of tildes over vowels (including y) and n *tilde over g: g̃—it's the only language in the world to use it. Example words: ''hagũa'' and ''g̃uahẽ''. *b, d, and g usually do not occur without m or n before (mb, nd, ng) unless they're Spanish loan words. *f, l, q, w, x, z extremely rare outside loan words *does not use c without h: ch === [[Japanese language|Japanese]] in [[Romaji]] ([[:ja:日本語|Nihongo/日本語]])=== *words: ''desu, aru, suru'', esp. at end of sentences; *word endings: ''-masu, -masen, -shita''; *letters: Japanese almost always alternates between a consonant and a vowel. Exceptions are [[digraph]]s ''shi'' and ''chi'', [[affricate]] ''tsu'', [[gemination]] (two of the same consonant in a row) and [[Palatalization (phonetics)|palatalization]] (a consonant followed by the letter ''y''). *a macron or circumflex may be used to indicate doubled vowels, eg. ''Tōkyō'' *common words: ''no, o, wa, de, ni'' (Note: Romaji is not often used in Japanese script. 
It is most often used for foreigners learning the pronunciation of the Japanese language.) ===[[Hmong language|Hmong]] ([[:hmn:Hmoob|Hmoob]]) written in [[Romanized Popular Alphabet]]=== *Almost all written words are quite short (one syllable). *Syllables (unless they are pronounced with mid tone) end in a tone letter: one of ''b s j v m g d'', leading to apparent "consonant clusters" such as ''-wj'' *'''w''' can be the main vowel of a syllable (e.g. ''tswv'') *Syllables can begin with sequences such as ''hm-, ntxh-, nq-''. *Syllables ending in double vowels (especially ''-oo, -ee'') possibly followed by a tone letters (as in ''Hmoob'' "Hmong"). ===[[Vietnamese language|Vietnamese]] ([[:vi:Tiếng_Việt|tiếng Việt]])=== *Roman characters with more than one diacritical mark on the same vowel. See [[#Characters|above]]. *Almost all written words are quite short (one syllable, mostly less than six characters long). *Words beginning with ''ng'' or ''ngh'' *Words ending with ''nh'' *common words: ''cái, không, có, ở, của, và, tại, với, để, đã, sẽ, đang, tôi, bạn, chúng, là'' ====Vietnamese Quoted-Readable ([[VIQR]])==== *The following characters (often in combination) after vowels: ^ ( + ' ` ? ~ . *DD, Dd, or dd *The following character before punctuation: \ ====Vietnamese [[VNI]] encoding==== *The digits 1-8 after vowels *The digit 9 after a D or d *The following character before numbers: \ ====Vietnamese [[Telex_(IME)|Telex]]==== *The following characters after vowels: s f r x j *The following vowels, doubled up: a e o *The letter ''w'' after the following characters: a o u *DD, Dd, or dd ===Chinese, Romanized=== ====[[Standard Mandarin|Standard Mandarin]] ([[:zh:現代標準漢語|現代標準漢語]])==== *In general, Mandarin syllables end only in vowels or n, ng, r; never in p, t, k, m =====[[Pinyin]]===== *Words beginning with x, q, zh *Tone marks on vowels, such as ā, á, ǎ, à **For convenience while using a computer, these are sometimes substituted with numbers, e.g. 
a1, a2, a3, a4 =====[[Wade–Giles]]===== *Words do not begin with ''b, d, g, z, q, x, r'' *Words beginning with ''hs'' *Many hyphenated words *Apostrophes after initial letters or digraphs, e.g. ''t'a, ch'i'' =====[[Gwoyeu Romatzyh]]===== *Many unusual vowel combinations such as ae, eei, ii, iee, oou, yy, etc. *Insertion of r, e.g. arn, erng, etc. *Words ending in nn, nq ==== [[Southern_Min|Southern Min / Min-Nan]] ([[:zh-min-nan:Bân-lâm-gú|Bân-lâm-gí/Bân-lâm-gú]]) in [[Pe̍h-ōe-jī]] ==== *Many hyphenated words. *Words can end in p, t, k, m, n, ng, h; never r *Roman characters with many diacritical marks on vowels. Unlike Vietnamese, each character has at most one such mark. *Unusual combining characters, namely · (middle dot, always after ''o'') and | (vertical bar). ¯ ([[Macron (diacritic)|macron]]) is also common. === [[Austronesian languages]] === ====[[Malay language|Malay]] ([[:ms:Bahasa_Melayu|bahasa Melayu]]) and [[Indonesian language|Indonesian]] ([[:id:Bahasa_Indonesia|bahasa Indonesia]])==== May contain the following:<br> Prefixes: ''me-, mem-, memper-, pe-, per-, di-, ke-''<br> Suffixes: ''-kan, -an, -i''<br> Others (these almost always written in lowercase): ''yang, dan, di, ke, oleh, itu''<br> [[Malay language|Malay]] and [[Indonesian language|Indonesian]] are mutually intelligible to proficient speakers, although translators and interpreters will generally be specialists in one or other language. See [[Comparison of Standard Malay and Indonesian]]. Frequent use of the letter 'a' (comparable to the frequency of the English 'e'). ==== [[Polynesian languages]] ==== Most Polynesian languages use A E F G H I K L M N O P R S T U V and [[ʻOkina|ʻ]] (sometimes written ' or Q) ** L : Nuclear Polynesian languages ([[Tongan language|Tongan]], [[Samoan language|Samoan]], [[Tuvaluan language|Tuvaluan]], [[Tokelauan language|Tokelauan]]...) 
as in ''fale''
** R : Eastern Polynesian languages ([[Māori language|NZ Māori]], [[Tahitian language|Tahitian]], [[Cook Islands Māori]], [[Rapa Nui language|Rapa Nui]]...) as in ''fare''
** K : most Polynesian languages except [[Hawaiian language|Hawaiian]], Samoan, Tahitian
** H : most Polynesian languages except Samoan
** WH : NZ Māori (''whenua'')
*Consonants always separated by one or more vowels (''fenua'', ''Haʻapai'', ''ʻolelo'')
*Short and long vowels, written either with a macron (āēīōū) or by replication (aa, ee, ii, oo, uu)
*Frequent diphthongs (''oiaue'', ''māori'')
*Words always end with a vowel
*Loanwords are transliterated (as in Japanese): ''Sesu Kilisito'' = Jesus Christ, ''polokalama'' = program
*Frequent English or French loanwords (depending on colonial history)
===== [[Tongan language|Tongan]] (lea fakatonga) =====
* A E F H I K L M N NG O P S T U V ʻ
*ng (''Tonga''), h, endings in -onua (''fonua'')
*article ''te''
*frequent words: 'o, te, ki, mei, i, faka-
*English loanwords
===== [[Samoan language|Samoan]] (gagana samoa) =====
* A E F G I L M N O P S T U V ʻ
*no K letter, uses okina (ʻ) or nothing instead (''faka'' in Tongan is ''faʻa'' in Samoan)
*frequent use of L (''le'')
*frequent words: ''o'', ''e'', ''le'', ''se'', ''a'', ''i'', ''ma''
===== [[Wallisian]] (lea faka'uvea) =====
* A E F G H I K L M N O P S T U V ʻ
*distinguish from Tongan: g instead of ng (''tokaga'')
*article ''te''
*h is more frequent than s (''tahi'')
*frequent words: ko, te, ki, mai, i, o, ne'e, e, mo, faka-
*French loanwords
===== [[East Futunan]] (lea fakafutuna) =====
* A E F G H I K L M N O P S T U V ʻ
*article ''le''
*frequent words: ko, le, ki, mei, i, o, mo, faka-
*distinguish from Wallisian: S is more frequent than H (''tasi'')
*distinguish from Samoan: letter K
*French loanwords
===[[Turkic languages]]===
Note that some Turkic languages like [[Azerbaijani language|Azeri]] and [[Turkmen language|Turkmen]] use a similar [[Latin alphabet]] (often [[Jaŋalif]]) and
similar words, and might be confused with Turkish. Azeri has the letters Əə, Xx and Qq not present in the Turkish alphabet, and Turkmen has Ää, Žž, Ňň, Ýý and Ww. Latin characters uniquely (or nearly uniquely) used for Turkic languages: Əə, Ŋŋ, Ɵɵ, Ьь, Ƣƣ, Ğğ, İ, and ı. All Turkic languages can form long words by adding multiple suffixes.
====[[Turkish language|Turkish]] ([[:tr:Türkçe|Türkçe/Türkiye_Türkçesi]])====
=====Turkish Alphabet=====
Lowercase: a b c ç d e f g ğ h ı i j k l m n o ö p r s ş t u ü v y z

Uppercase: A B C Ç D E F G Ğ H I İ J K L M N O Ö P R S Ş T U Ü V Y Z
=====Common words=====
*''bir'': one, a
*''bu'': this
*''ancak'': but
*''oldu'': was (happened)
*''şu'': that
=====Misc.=====
*The letter "j" is only used in loanwords.
*Words never begin with "ğ"
*Look for common word endings. Tense changes in Turkish verbs are created by adding suffixes to the end of the verb. Plurals are formed by adding ''-lar'' or ''-ler''.
**Common tense suffixes: ''-yor'' ''-mış'' ''-muş'' ''-sun''
**Possession/person: ''-im'' ''-un'' ''-ın'' ''-in'' ''-iz'' ''-dur'' ''-tır''
**Example: ''Yap'''mıştır''''', "[He] has done it"; ''Yap'' is the verb stem meaning "to do", ''-mış'' indicates the perfect tense, ''-tır'' indicates the third person (he/she/it).
**Example: [[Adalar|''Ada'''lar''''']], "Islands"; ''Ada'' is a noun meaning "island", ''-lar'' makes it plural.
**Example: ''Ev'''imiz''''', "Our house"; ''Ev'' is a noun meaning "house", ''-im'' indicates the first-person possessor, which ''-iz'' then makes plural.
====[[Azerbaijani language|Azeri]] ([[:az:Azərbaycan_dili|Azərbaycanca]])====
Azeri can be easily recognized by the frequent use of ''[[ə]]''. This letter is not used in any other officially recognized modern Latin alphabet. In addition, it uses the letters ''x'' and ''q'', which are not used in Turkish.
*Common words: ''və'', ''ki'', ''ilə'', ''bu'', ''o'', ''isə'', ''görə'', ''da'', ''də'' *Frequent use of diacritics: ''ç'', ''ğ'', ''ı'', ''İ'', ''ö'', ''ş'', ''ü'' *Words ending in ''-lar'', ''-lər'', ''-ın'', ''-in'', ''-da'', ''-də'', ''-dan'', ''-dən'' *Words never beginning with ''ğ'' or ''ı'' *Words rarely beginning with two or more consonants *Transliteration of foreign words and names, e.g. ''Audrey Hepburn'' = ''Odri Hepbern''
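
The letter inventories given in the Turkic languages section (''ə'', ''x'', ''q'' for Azeri; ''ä'', ''ž'', ''ň'', ''ý'', ''w'' for Turkmen) can be counted mechanically. This is a minimal sketch, not a reliable classifier: the function name <code>guess_turkic</code> and the "most markers wins" rule are assumptions made for illustration, and loanwords or foreign names containing ''x'', ''q'' or ''w'' in a Turkish text will mislead it.

```python
import unicodedata

# Letters the chart lists as present in Azeri but not in Turkish.
AZERI_MARKERS = set("əxq")
# Letters the chart lists as present in Turkmen but not in Turkish.
TURKMEN_MARKERS = set("äžňýw")


def guess_turkic(text):
    """Guess Azeri vs. Turkmen by counting marker letters.

    Falls back to "Turkish or undetermined" when neither set dominates.
    """
    # NFC keeps precomposed letters such as "ý" as single code points.
    lowered = unicodedata.normalize("NFC", text).lower()
    azeri_hits = sum(1 for ch in lowered if ch in AZERI_MARKERS)
    turkmen_hits = sum(1 for ch in lowered if ch in TURKMEN_MARKERS)
    if azeri_hits > turkmen_hits:
        return "Azeri"
    if turkmen_hits > azeri_hits:
        return "Turkmen"
    return "Turkish or undetermined"
```

For example, <code>guess_turkic("Azərbaycan dili")</code> returns <code>"Azeri"</code> because of the single ''ə''.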