Dotted and dotless I in computing

Error when displaying dotted İ as a dotless I while translating from Turkish to Polish

The dotted İ i and the dotless I ı are distinct letters in the alphabets of several Turkic languages, whereas English and most other languages written in the Latin script have a single letter I i. This difference has caused a number of problems in computing.

Difficulties

Unicode does not encode the uppercase form of dotless ı or the lowercase form of dotted İ as separate characters. Instead, it unifies them with the uppercase and lowercase forms of the Latin letter I (U+0049 I and U+0069 i, respectively); only the dotted capital İ (U+0130) and the dotless small ı (U+0131) have code points of their own. John Cowan proposed disunifying plain I i by adding a capital letter dotless I and a small letter i with dot above, to make casing more consistent. The Unicode Technical Committee had previously rejected a similar proposal because it would break mappings from existing character sets that contain dotted and dotless I and would corrupt existing data in these languages.[citation needed]
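The encoding described above can be inspected with a short Python sketch. It relies only on the standard `unicodedata` module; the helper name `describe` is an illustrative choice, not part of any library.

```python
import unicodedata

# The four letters involved in Turkic casing, written as escapes so that
# the code points are unambiguous: I, i, İ (U+0130), ı (U+0131).
LETTERS = ["\u0049", "\u0069", "\u0130", "\u0131"]


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in LETTERS:
    print(ch, describe(ch))
```

Only two of the four letters have dedicated code points; the other two share the plain Latin I and i, which is why casing cannot be decided from the character alone.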

Most Unicode software uppercases ı to I but, unless specifically configured for Turkish, lowercases I to i. Uppercasing and then lowercasing a string therefore turns ı into i. Likewise, most Unicode software uppercases i to I rather than to İ, so the letter changes in that direction as well.
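A minimal Python sketch of this round-trip problem follows. Python's `str.upper()` and `str.lower()` apply the default, locale-independent Unicode mappings. The helpers `turkish_upper` and `turkish_lower` are hypothetical names introduced here for illustration; they handle only the four I letters and are not a full implementation of Turkish casing.

```python
def turkish_upper(text: str) -> str:
    """Uppercase using Turkish rules for i and ı (illustrative helper)."""
    # Map the two lowercase letters explicitly, then let the default
    # mapping handle every other character.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkish_lower(text: str) -> str:
    """Lowercase using Turkish rules for İ and I (illustrative helper)."""
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


word = "\u0131s\u0131"  # "ısı", spelled with two dotless letters

# Default casing: the round trip silently replaces ı with i.
default_round_trip = word.upper().lower()

# Turkish-aware casing: the round trip preserves the original letters.
turkish_round_trip = turkish_lower(turkish_upper(word))

print(default_round_trip == word)   # False
print(turkish_round_trip == word)   # True
```

Under the default mappings, Python also lowercases İ to the two-character sequence i followed by U+0307 (combining dot above), so the length of a string can change as well.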

In the Microsoft Windows SDK, beginning with Windows Vista, several relevant functions have a NORM_LINGUISTIC_CASING flag, to indicate that for Turkish and Azerbaijani locales, I should map to ı.

In the LaTeX typesetting language the dotless ı can be written with the backslash-i command: \i.

Dotted İ and dotless ı cause problems in the Turkish locales of several software packages, including Oracle DBMS, PHP, Java, and Unixware 7, where implicit capitalization of keywords, variable names, and table names has effects that the application developers did not foresee. The C and US English locales do not have these problems. The .NET Framework has special provisions to handle the "Turkish i".
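The failure mode can be sketched in Python as follows. Python's own `str.upper()` ignores the locale, so the hypothetical helper `turkish_upper` stands in for a locale-sensitive uppercase routine running under a Turkish locale. The keyword set and the function names are assumptions made for this example only.

```python
# Keywords of an imaginary case-insensitive language, stored in uppercase.
KEYWORDS = {"SELECT", "INSERT", "TITLE"}


def turkish_upper(text: str) -> str:
    """Stand-in for locale-sensitive uppercasing under a Turkish locale."""
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def is_keyword_locale_sensitive(word: str) -> bool:
    """Buggy lookup: normalizes with the user's (Turkish) casing rules."""
    return turkish_upper(word) in KEYWORDS


def is_keyword_invariant(word: str) -> bool:
    """Safer lookup: normalizes with locale-independent casing rules."""
    return word.upper() in KEYWORDS


print(is_keyword_locale_sensitive("select"))  # True: no i in the word
print(is_keyword_locale_sensitive("insert"))  # False: becomes "İNSERT"
print(is_keyword_invariant("insert"))         # True
```

The usual remedy is to normalize identifiers with a fixed, locale-independent mapping and to reserve locale-sensitive casing for text shown to users.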

Many cellphones available in Turkey (as of 2008) lacked proper localization, which led to ı being replaced by i in SMS messages, sometimes severely distorting the sense of a text. In one instance, such a miscommunication played a role in the deaths of Emine and Ramazan Çalçoban in 2008: the word "sıkışınca" arrived looking like "sikişince", so that a message Ramazan meant as "You change the topic every time you run out of arguments" was read by Emine as "You change the topic every time they are fucking you". A common substitution is to use the digit 1 for dotless ı. This is also common in Azerbaijan (see also Translit), but the meaning of words is generally understood.

In some Ectaco translators, the letter İ was also treated as I (e.g. TRAFIK was displayed where the correct spelling is TRAFİK).

External links

Tex Texin, Internationalization for Turkish: Dotted and Dotless Letter "I", accessed 15 Nov 2005
The Turkish İ Problem and Why You Should Care | You've Been Haacked
