Tocharian languages


The Tocharian (sometimes Tokharian) languages, also known as the Arśi-Kuči, Agnean-Kuchean or Kuchean-Agnean languages, are an extinct branch of the Indo-European language family spoken by inhabitants of the Tarim Basin, the Tocharians.<ref name="DD">Template:Cite book</ref> The languages are known from manuscripts dating from the 5th to the 8th century AD, which were found in oasis cities on the northern edge of the Tarim Basin (now part of Xinjiang in Northwest China) and the Lop Desert. The discovery of these languages in the early 20th century contradicted the formerly prevalent idea of an east–west division of the Indo-European language family into centum and satem languages, and reinvigorated the study of the family. Scholars studying these manuscripts in the early 20th century identified their authors with the Tokharoi, a name used in ancient sources for people of Bactria (Tokharistan). Although this identification is now believed to be mistaken, "Tocharian" remains the usual term for these languages.<ref>Template:Cite journal</ref><ref name="DD" />

The discovered manuscripts record two closely related languages, called Tocharian A (also East Tocharian or Turfanian) and Tocharian B (West Tocharian or Kuchean).<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref><ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> The subject matter of the texts suggests that Tocharian A was more archaic and used as a Buddhist liturgical language, while Tocharian B was more actively spoken in the entire area from Turfan in the east to Tumshuq in the west. A body of loanwords and names found in Prakrit documents from the Lop Nur basin has been dubbed Tocharian C (Kroränian). A claimed find of ten Tocharian C texts written in Kharosthi has been discredited.<ref name="adams-tocharian-c-again">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>

The oldest extant manuscripts in Tocharian B are now dated to the fifth or even late fourth century AD, making it a language of late antiquity contemporary with Gothic, Classical Armenian, and Primitive Irish.<ref>Template:Cite book</ref>

Discovery and significance

[Figure: The geographical spread of Indo-European languages.]

The existence of the Tocharian languages and alphabet was not even suspected until archaeological exploration of the Tarim Basin by Aurel Stein in the early 20th century brought to light fragments of manuscripts in an unknown language, dating from the 6th to 8th centuries AD.<ref>Template:Cite book</ref>

It soon became clear that these fragments were actually written in two distinct but related languages belonging to a hitherto unknown branch of Indo-European, now known as Tocharian:

Tocharian A (Turfanian, Agnean, or East Tocharian; natively ārśi) of Qarašähär (ancient Agni, Chinese Yanqi and Sanskrit Agni) and Turpan (ancient Turfan and Xočo), and
Tocharian B (Kuchean or West Tocharian) of Kucha and Tocharian A sites.

Prakrit documents from 3rd-century Krorän and Niya on the southeast edge of the Tarim Basin contain loanwords and names that appeared to scholars to come from a closely related language, referred to as Tocharian C.<ref name="mallory-expedition">Template:Cite periodical</ref> However, the evidence from Krorän was later found to be entirely flawed (see below, section "Tocharian C"). In 2024, a dissertation by Niels Schoubben (Leiden University) argued that all the so-called Tocharian loanwords in Niya Prakrit were in fact Bactrian and pre-Bactrian loanwords, or resulted from fundamental misunderstandings of specific words and orthographies. His work effectively put an end to the "Tocharian C" hypothesis.<ref>Schoubben, Niels. 2024. Traces of language contact in Niya Prakrit: Bactrian and other foreign elements. PhD dissertation, Leiden University.</ref>

The discovery of Tocharian upset some theories about the relations of Indo-European languages and revitalized their study. In the 19th century, it was thought that the division between centum and satem languages was a simple west–east division, with centum languages in the west. The theory was undermined in the early 20th century by the discovery of Hittite, a centum language in a relatively eastern location, and Tocharian, which was a centum language despite being the easternmost branch. The result was a new hypothesis, following the wave model of Johannes Schmidt, suggesting that the satem isogloss represents a linguistic innovation in the central part of the Proto-Indo-European home range, and that the centum languages along the eastern and the western peripheries did not undergo that change.

Most scholars identify the ancestors of the Tocharians with the Afanasievo culture of South Siberia (c. 3300–2500 BC), an early eastern offshoot of the steppe cultures of the Don-Volga area that later became the Yamnayans.<ref name="Anthony2010, p=264–265, 308">Template:Cite book</ref><ref>Template:Cite journal</ref> Under this scenario, Tocharian speakers would have immigrated to the Tarim Basin from the north at some later point.

Most scholars reject Walter Bruno Henning's proposed link to Gutian, a language spoken on the Iranian plateau in the 22nd century BC and known only from personal names.

Tocharian probably died out after 840 when the Uyghurs, expelled from Mongolia by the Kyrgyz, moved into the Tarim Basin.<ref name="mallory-expedition" /> The theory is supported by the discovery of translations of Tocharian texts into Uyghur.

Some modern Chinese words may ultimately derive from a Tocharian or related source, e.g. Old Chinese *mjit (蜜; mì) "honey", from Proto-Tocharian *ḿət(ə) (where *ḿ is palatalized; cf. Tocharian B mit), cognate with Old Church Slavonic медъ (transliterated: medŭ), meaning "honey", and English mead.<ref>Template:Harvp; Template:Harvp; Template:Harvp; Template:Harvp; Proto-Tocharian and Tocharian B forms from Template:Harvp.</ref>

Names

[Figure: Royal family, Cave 17, Kizil (family detail, retouched), Hermitage Museum.]<ref>Image 16 in Template:Cite book</ref><ref name="RG">"The images of donors in Cave 17 are seen in two fragments with numbers MIK 8875 and MIK 8876. One of them with halo may be identified as king of Kucha." in Template:Cite book "The panel of Tocharian donors and Buddhist monks, which was at the MIK (MIK 8875) disappeared during World War II and was discovered by Yaldiz in 2002 in the Hermitage Museum" page 65, note 30</ref><ref name="AVLC68">Template:Cite book</ref>

A colophon to a Central Asian Buddhist manuscript from the late 8th century states that it was translated into Old Turkic from Sanskrit, via a twγry language. In 1907, Emil Sieg and Friedrich W. K. Müller proposed that twγry was a name for the newly discovered language of the Turpan area. Sieg and Müller, reading this name as toxrï, connected it with the ethnonym Tócharoi (Ancient Greek: Τόχαροι, Ptolemy VI, 11, 6, 2nd century AD), itself taken from Indo-Iranian (cf. Old Persian tuxāri-, Khotanese ttahvāra, and Sanskrit tukhāra), and proposed the name "Tocharian" (German Tocharisch). Ptolemy's Tócharoi are often associated by modern scholars with the Yuezhi of Chinese historical accounts, who founded the Kushan Empire. It is now clear that these people actually spoke Bactrian, an Eastern Iranian language, rather than the language of the Tarim manuscripts, so the term "Tocharian" is considered a misnomer.<ref>Template:Cite book</ref><ref>Template:Harvp "In fact, we know that the Yuezhi used Bactrian, an Iranian language written in Greek characters, as an official language. For this reason, Tocharian is a misnomer; no extant evidence suggests that the residents of the Tocharistan region of Afghanistan spoke the Tocharian language recorded in the documents found in the Kucha region."</ref><ref>Template:Harvp: "At the same time we can now finally dispose of the name 'Tokharian'. This misnomer has been supported by three reasons, all of them now discredited."</ref> Nevertheless, it remains the standard term for the language of the Tarim Basin manuscripts.<ref name="Tocharian Online">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref><ref>Template:Cite book</ref>

In 1938, Walter Bruno Henning found the term "four twγry" used in early 9th-century manuscripts in Sogdian, Middle Iranian, and Uyghur. He argued that it referred to the region on the northeast edge of the Tarim, including Agni and Karakhoja, but not Kucha. He thus inferred that the colophon referred to the Agnean language.

Although the term twγry or toxrï appears to be the Old Turkic name for the Tocharians, it is not found in Tocharian texts.<ref name="Tocharian Online" /> The apparent self-designation ārśi appears in Tocharian A texts. Tocharian B texts use the adjective kuśiññe, derived from kuśi or kuči, a name also known from Chinese and Turkic documents.<ref name="Tocharian Online" /> The historian Bernard Sergent compounded these names to coin an alternative term Arśi-Kuči for the family, recently revised to Agni-Kuči,<ref>Template:Cite book</ref> but this name has not achieved widespread usage.

Writing system


Tocharian is documented in manuscript fragments, mostly from the 8th century (with a few earlier ones) that were written on palm leaves, wooden tablets, and Chinese paper, preserved by the extremely dry climate of the Tarim Basin. Samples of the language have been discovered at sites in Kucha and Karasahr, including many mural inscriptions.

Most attested Tocharian was written in the Tocharian alphabet, a derivative of the Brahmi alphabetic syllabary (abugida) also referred to as North Turkestan Brahmi or slanting Brahmi. However, a smaller amount was written in the Manichaean script in which Manichaean texts were recorded. It soon became apparent that a large proportion of the manuscripts were translations of known Buddhist works in Sanskrit, and some of them were even bilingual, facilitating decipherment of the new language. Besides the Buddhist and Manichaean religious texts, there were also monastery correspondence and accounts, commercial documents, caravan permits, medical and magical texts, and one love poem.

In 1998, the Chinese linguist Ji Xianlin published a translation and analysis of fragments of a Tocharian Maitreyasamiti-Nataka discovered in 1974 in Yanqi.<ref>"Fragments of the Tocharian", Andrew Leonard, How the World Works, Salon.com, January 29, 2008. Template:Webarchive</ref><ref>Template:Cite journal</ref><ref>Template:Cite book</ref>

Tocharian A and B

[Figure: Tocharian languages A (blue), B (red) and C (green) in the Tarim Basin. Tarim oasis towns are given as listed in the Book of Han (c. 2nd century BC), with the areas of the squares proportional to population.]

Tocharian A and B are significantly different, to the point of being mutually unintelligible. A common Proto-Tocharian language must precede the attested languages by several centuries, probably dating to the late 1st millennium BC.<ref>Template:Cite book</ref>

Tocharian A is found only in the eastern part of the Tocharian-speaking area, and all extant texts are of a religious nature. Tocharian B, however, is found throughout the range and in both religious and secular texts. As a result, it has been suggested that Tocharian A was a liturgical language, no longer spoken natively, while Tocharian B was the spoken language of the entire area.<ref name="mallory-expedition" />

The hypothesized relationship of Tocharian A and B as liturgical and spoken forms, respectively, is sometimes compared with the relationship between Latin and the modern Romance languages, or Classical Chinese and Mandarin. However, in both of these latter cases, the liturgical language is the linguistic ancestor of the spoken language, whereas no such relationship holds between Tocharian A and B. In fact, from a phonological perspective Tocharian B is significantly more conservative than Tocharian A, and serves as the primary source for reconstructing Proto-Tocharian. Only Tocharian B preserves the following Proto-Tocharian features: stress distinctions, final vowels, diphthongs, and o vs. e distinction. In turn, the loss of final vowels in Tocharian A has led to the loss of certain Proto-Tocharian categories still found in Tocharian B, e.g. the vocative case and some of the noun, verb, and adjective declensional classes.

In their declensional and conjugational endings, the two languages innovated in divergent ways, with neither clearly simpler than the other. For example, both languages show significant innovations in the present active indicative endings but in radically different ways, so that only the second-person singular ending is directly cognate between the two languages, and in most cases neither variant is directly cognate with the corresponding Proto-Indo-European (PIE) form. The agglutinative secondary case endings in the two languages likewise stem from different sources, showing parallel development of the secondary case system after the Proto-Tocharian period. Likewise, some of the verb classes show independent origins, e.g. the class II preterite, which uses reduplication in Tocharian A (possibly from the reduplicated aorist) but long PIE ē in Tocharian B (possibly related to the long-vowel perfect found in Latin lēgī, fēcī, etc.).<ref name="Tocharian Online" />

Tocharian B shows an internal chronological development; three linguistic stages have been detected. The oldest stage is attested only in Kucha; the other two are the middle ("classical") and late stages.<ref>Template:Cite encyclopedia</ref>

Tocharian C

A third Tocharian language was first suggested by Thomas Burrow in the 1930s, while discussing 3rd-century documents from Krorän (Loulan) and Niya. The texts were written in Gandhari Prakrit, but contained loanwords of evidently Tocharian origin, such as kilme ('district'), ṣoṣthaṃga ('tax collector'), and ṣilpoga ('document'). This hypothetical language later became generally known as Tocharian C. It has also sometimes been called Kroränian or Krorainic.<ref>Template:Cite journal</ref>

In papers published posthumously in 2018, Klaus T. Schmidt, a scholar of Tocharian, presented a decipherment of 10 texts written in the Kharoṣṭhī script. Schmidt claimed that these texts were written in a third Tocharian language he called Lolanisch.<ref>Template:Cite book</ref><ref name=":0">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> He also suggested that the language was closer to Tocharian B than to Tocharian A.<ref name=":0" /> In 2019 a group of linguists led by Georges-Jean Pinault and Michaël Peyrot convened in Leiden to examine Schmidt's translations against the original texts. They concluded that Schmidt's decipherment was fundamentally flawed, that there was no reason to associate the texts with Krorän, and that the language they recorded was neither Tocharian nor Indic, but Iranian.<ref name="adams-tocharian-c-again" /><ref>Template:Cite journal</ref>

In 2024, Schoubben conducted a systematic review of Niya Prakrit and the loanwords claimed as evidence for Tocharian C. He argued that most of the words in question could be explained as loanwords from Bactrian or other Iranian languages, and found no compelling evidence for a Tocharian substrate.<ref>Template:Cite thesis</ref> For example, Burrow proposed that aṃklatsa, 'a type of camel', corresponded to Tocharian A āknats and Tocharian B aknātsa 'stupid, foolish', believing that this would refer to an 'untrained camel', from a Tocharian form *anknats (with the negative prefix *en-). Not only does this etymology presuppose an ad hoc sound change from *-nkn- to *-nkl-, but it also leaves the variant agiltsa, likewise found in Niya, unexplained. Schoubben suggests that this might be a Bactrian word, as camels originally come from Bactria, but could not find a convincing etymology.<ref>Schoubben 2024: 430</ref> He had earlier argued that <ḱ> was used in Niya Prakrit to transcribe Bactrian -šk- (spelled ϸκ in the Bactrian alphabet). For example, Burrow had tentatively connected maḱa, a Niya Prakrit word for an unidentified food produced on farms, with Tocharian A malke 'milk', but Schoubben derives it from Proto-Iranian *māšaka- 'bean'.<ref>Template:Cite journal</ref>

Phonology

Phonetically, the Tocharian languages are "centum" Indo-European languages, meaning that they merge the palatovelar consonants (*ḱ, *ǵ, *ǵʰ) of Proto-Indo-European with the plain velars (*k, *g, *gʰ) rather than palatalizing them to affricates or sibilants. Centum languages are mostly found in western and southern Europe (Greek, Italic, Celtic, Germanic). In that sense Tocharian (to some extent like Greek and the Anatolian languages) seems to have been an isolate in the "satem" (i.e. palatovelar to sibilant) phonetic regions of Indo-European-speaking populations. The discovery of Tocharian contributed to doubts that Proto-Indo-European had originally split into western and eastern branches; today, the centum–satem division is not seen as a real familial division.<ref name="baldi">Template:Cite book</ref>
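The centum merger described above can be pictured as a simple symbol substitution. The following is a minimal sketch, not a model of any attested sound law: the function name `apply_shift` and both lookup tables are illustrative assumptions, the "satem" outputs are schematic placeholder symbols (actual reflexes differ from branch to branch), and the test word is the commonly cited reconstruction *ḱm̥tóm 'hundred'.

```python
import unicodedata

# Centum merger: palatovelars fall together with the plain velars.
CENTUM = {"\u1e31": "k", "\u01f5": "g"}  # ḱ -> k, ǵ -> g (so ǵʰ -> gʰ)

# Schematic satem outcome (assumed placeholder symbols, not real reflexes).
SATEM = {"\u1e31": "\u015b", "\u01f5": "\u017a"}  # ḱ -> ś, ǵ -> ź


def apply_shift(form: str, table: dict[str, str]) -> str:
    """Replace every palatovelar in a reconstructed form using `table`."""
    # Normalize so that ḱ and ǵ are single precomposed code points.
    form = unicodedata.normalize("NFC", form)
    return "".join(table.get(ch, ch) for ch in form)


pie_hundred = "\u1e31m\u0325t\u00f3m"  # *ḱm̥tóm 'hundred'
print(apply_shift(pie_hundred, CENTUM))  # begins with plain velar k
print(apply_shift(pie_hundred, SATEM))   # begins with sibilant-like ś
```

Only the initial consonant differs between the two outputs, which is the whole content of the centum/satem label: it names one isogloss, not a genealogical split.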

Vowels

	Front	Central	Back
Close	i /i/	ä /ɨ/	u /u/
Mid	e /e/	a /ə/	o /o/
Open		ā /a/

Tocharian A and Tocharian B have the same set of vowels, but they often do not correspond to each other. For example, the sound a did not occur in Proto-Tocharian. Tocharian B a is derived from former stressed ä or unstressed ā (reflected unchanged in Tocharian A), while Tocharian A a stems from Proto-Tocharian /ɛ/ or /ɔ/ (reflected as /e/ and /o/ in Tocharian B), and Tocharian A e and o stem largely from monophthongization of former diphthongs (still present in Tocharian B).

Diphthongs

Diphthongs occur in Tocharian B only.

	Closer component is front	Closer component is back
Opener component is unrounded	ai /əi/	au /əu/, āu /au/
Opener component is rounded	oy /oi/

Consonants

[Figure: Wooden tablet with an inscription showing Tocharian B in its Brahmic form. Kucha, Xinjiang, 5th–8th century (Tokyo National Museum).]

The following table lists the reconstructed phonemes in Tocharian along with their standard transcription. Because Tocharian is written in an alphabet used originally for Sanskrit and its descendants, the transcription reflects Sanskrit phonology, and may not represent Tocharian phonology accurately. The Tocharian alphabet also has letters representing all of the remaining Sanskrit sounds, but these appear only in Sanskrit loanwords and are not thought to have had distinct pronunciations in Tocharian. There is some uncertainty as to actual pronunciation of some of the letters, particularly those representing palatalized obstruents (see below).

	Bilabial	Alveolar	Alveolo-palatal	Palatal	Velar
Plosive	p /p/	t /t/			k /k/
Affricate		ts /ts/	c /tɕ/?²
Fricative		s /s/	ś /ɕ/	ṣ /ʃ/?³
Nasal	m /m/	n, ṃ /n/¹		ñ /ɲ/	ṅ /ŋ/⁴
Trill		r /r/
Approximant				y /j/	w /w/
Lateral approximant		l /l/		ly /ʎ/

1. /n/ is transcribed by two different letters in the Tocharian alphabet depending on position. Based on the corresponding letters in Sanskrit, these are transcribed ṃ (word-finally, including before certain clitics) and n (elsewhere), but ṃ represents /n/, not /m/.
2. The sound written c is thought to correspond to an alveolo-palatal affricate /tɕ/ in Sanskrit. The Tocharian pronunciation /tɕ/ is suggested by the common occurrence of the cluster śc, but the exact pronunciation cannot be determined with certainty.
3. The sound written ṣ seems more likely to have been a palato-alveolar sibilant /ʃ/ (as in English "ship"), because it derives from a palatalized /s/.<ref name="ringe-proto-tocharian">Ringe, Donald A. (1996). On the Chronology of Sound Changes in Tocharian: Volume I: From Proto-Indo-European to Proto-Tocharian. New Haven, CT: American Oriental Society.</ref>
4. The sound ṅ /ŋ/ occurs only before k, or in some clusters where a k has been deleted between consonants. It is clearly phonemic because sequences nk and ñk also exist (from syncope of a former ä between them).

Morphology

Nouns

Tocharian has completely reworked the nominal declension system of Proto-Indo-European. The only cases inherited from the proto-language are nominative, genitive, accusative, and (in Tocharian B only) vocative; in Tocharian the old accusative is known as the oblique case. In addition to these primary cases, however, each Tocharian language has six cases formed by the addition of an invariant suffix to the oblique case, although the set of six cases is not the same in each language, and the suffixes are largely non-cognate. For example, the Tocharian word yakwe (Toch B), yuk (Toch A) "horse" < PIE *eḱwos is declined as follows:<ref name="Tocharian Online" />

Case	Tocharian B			Tocharian A
Case	Suffix	Singular	Plural	Suffix	Singular	Plural
Nominative	–	yakwe	yakwi	–	yuk	yukañ
Vocative	–	yakwa	–	–	–	–
Genitive	–	yäkwentse	yäkweṃtsi	–	yukes	yukāśśi
Oblique	–	yakwe	yakweṃ	–	yuk	yukas
Instrumental	–	–	–	-yo	yukyo	yukasyo
Perlative	-sa	yakwesa	yakwentsa	-ā	yukā	yukasā
Comitative	-mpa	yakwempa	yakweṃmpa	-aśśäl	yukaśśäl	yukasaśśäl
Allative	-ś(c)	yakweś(c)	yakweṃś(c)	-ac	yukac	yukasac
Ablative	-meṃ	yakwemeṃ	yakweṃmeṃ	-äṣ	yukäṣ	yukasäṣ
Locative	-ne	yakwene	yakweṃne	-aṃ	yukaṃ	yukasaṃ
Causative	-ñ	yakweñ	yakweṃñ	–	–	–

The Tocharian A instrumental case rarely occurs with humans.
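The agglutinative pattern described in this section, an invariant suffix attached to the oblique form, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `secondary_cases` is hypothetical, the oblique singular forms yakwe (Tocharian B) and yuk (Tocharian A) are supplied as assumed inputs, only a handful of suffixes is included, and plain concatenation is used, so plural forms (where the stem-final nasal interacts with the suffix) are deliberately left out.

```python
# A few secondary case suffixes, with the leading hyphen stripped.
SECONDARY_SUFFIXES = {
    "B": {"perlative": "sa", "comitative": "mpa"},
    "A": {
        "instrumental": "yo",
        "perlative": "ā",
        "comitative": "aśśäl",
        "allative": "ac",
    },
}


def secondary_cases(oblique_sg: str, language: str) -> dict[str, str]:
    """Build singular secondary case forms as oblique form + invariant suffix."""
    suffixes = SECONDARY_SUFFIXES[language]
    return {case: oblique_sg + suffix for case, suffix in suffixes.items()}


# Assumed oblique singular forms of the word for "horse".
print(secondary_cases("yakwe", "B"))
print(secondary_cases("yuk", "A"))
```

Because each language keeps its own suffix table, the sketch also makes the section's main point visible: the mechanism is shared, but the suffixes themselves are largely non-cognate.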

When referring to humans, the oblique singular of most adjectives and of some nouns is marked in both varieties by an ending -(a)ṃ, which also appears in the secondary cases. An example is eṅkwe (Toch B), oṅk (Toch A) "man", which belongs to the same declension as above, but has oblique singular eṅkweṃ (Toch B), oṅkaṃ (Toch A), and corresponding oblique stems eṅkwen- (Toch B), oṅkn- (Toch A) for the secondary cases. This is thought to stem from the generalization of n-stem adjectives as an indication of determinative semantics, seen most prominently in the weak adjective declension in the Germanic languages (where it co-occurs with definite articles and determiners), but also in Latin and Greek n-stem nouns (especially proper names) formed from adjectives, e.g. Latin Catō (genitive Catōnis) literally "the sly one" < catus "sly",<ref>Template:Cite dictionary</ref><ref>Template:Cite dictionary</ref> Greek Plátōn literally "the broad-shouldered one" < platús "broad".<ref name="Tocharian Online" />

Verbs

[Figure: Kucha (龜茲國, Qiuci) in the Wanghuitu (王会图), circa 650 AD.]

In contrast to the nominal system, the verbal conjugation system is quite conservative. The majority of Proto-Indo-European verbal classes and categories are represented in some manner in Tocharian, although not necessarily with the same function.<ref>Douglas Q. Adams, "On the Development of the Tocharian Verbal System", Journal of the American Oriental Society, Vol. 98, No. 3 (Jul. – Sep., 1978), pp. 277–288.</ref> Some examples: athematic and thematic present tenses, including null-, -y-, -sḱ-, -s-, -n- and -nH- suffixes as well as n-infixes and various laryngeal-ending stems; o-grade and possibly lengthened-grade perfects (although lacking reduplication or augment); sigmatic, reduplicated, thematic, and possibly lengthened-grade aorists; optatives; imperatives; and possibly PIE subjunctives.

In addition, most PIE sets of endings are found in some form in Tocharian (although with significant innovations), including thematic and athematic endings, primary (non-past) and secondary (past) endings, active and mediopassive endings, and perfect endings. Dual endings are still found, although they are rarely attested and generally restricted to the third person. The mediopassive still reflects the distinction between primary -r and secondary -i, effaced in most Indo-European languages. Both root and suffix ablaut are still well represented, although again with significant innovations.

Categories

Tocharian verbs are conjugated in the following categories:<ref name="Tocharian Online" />

Mood: indicative, subjunctive, optative, imperative.
Tense/aspect (in the indicative only): present, preterite, imperfect.
Voice: active, mediopassive, deponent.
Person: 1st, 2nd, 3rd.
Number: singular, dual, plural.
Causation: basic, causative.
Non-finite: active participle, mediopassive participle, present gerundive, subjunctive gerundive.

Classes

A given verb belongs to one of a large number of classes, according to its conjugation. As in Sanskrit, Ancient Greek, and (to a lesser extent) Latin, there are independent sets of classes in the present indicative, subjunctive, preterite, imperative, and to a limited extent optative and imperfect. There is no general correspondence among the different sets of classes, so each verb must be specified using a number of principal parts.

Present indicative

The most complex system is the present indicative, consisting of 12 classes, 8 thematic and 4 athematic, with distinct sets of thematic and athematic endings. The following classes occur in Tocharian B (some are missing in Tocharian A):

I: Athematic without suffix < PIE root athematic.
II: Thematic without suffix < PIE root thematic.
III: Thematic with PToch suffix *-ë-. Mediopassive only. Apparently reflecting consistent PIE o theme rather than the normal alternating o/e theme.
IV: Thematic with PToch suffix *-ɔ-. Mediopassive only. Same PIE origin as previous class, but diverging within Proto-Tocharian.
V: Athematic with PToch suffix *-ā-, likely from either PIE verbs ending in a syllabic laryngeal or PIE derived verbs in *-eh₂- (but extended to other verbs).
VI: Athematic with PToch suffix *-nā-, from PIE verbs in *-nH-.
VII: Athematic with infixed nasal, from PIE infixed nasal verbs.
VIII: Thematic with suffix -s-, possibly from PIE -sḱ-?
IX: Thematic with suffix -sk- < PIE -sḱ-.
X: Thematic with PToch suffix *-näsk/nāsk- (evidently a combination of classes VI and IX).
XI: Thematic in PToch suffix *-säsk- (evidently a combination of classes VIII and IX).
XII: Thematic with PToch suffix *-(ä)ññ- < either PIE *-n-y- (denominative to n-stem nouns) or PIE *-nH-y- (deverbative from PIE *-nH- verbs).

Palatalization of the final root consonant occurs in the 2nd singular, 3rd singular, 3rd dual and 2nd plural in thematic classes II and VIII-XII as a result of the original PIE thematic vowel e.
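The class inventory and the palatalization rule above can be restated as a small lookup. This is a sketch for orientation only: the set names and the function `root_final_palatalized` are illustrative assumptions, and the rule is encoded exactly as stated in the text, without the exceptions a full grammar would list.

```python
# Present indicative classes of Tocharian B, split by stem type.
THEMATIC = {"II", "III", "IV", "VIII", "IX", "X", "XI", "XII"}
ATHEMATIC = {"I", "V", "VI", "VII"}

# Thematic classes in which the original thematic vowel e palatalizes the root.
PALATALIZING_CLASSES = {"II", "VIII", "IX", "X", "XI", "XII"}

# Person/number slots where the thematic vowel was originally e.
PALATALIZING_SLOTS = {
    (2, "singular"),
    (3, "singular"),
    (3, "dual"),
    (2, "plural"),
}


def root_final_palatalized(verb_class: str, person: int, number: str) -> bool:
    """Return True if the final root consonant is palatalized in this form."""
    return (
        verb_class in PALATALIZING_CLASSES
        and (person, number) in PALATALIZING_SLOTS
    )


# Sanity check against the counts given in the text: 8 thematic, 4 athematic.
assert len(THEMATIC) == 8 and len(ATHEMATIC) == 4

print(root_final_palatalized("IX", 3, "singular"))   # True
print(root_final_palatalized("III", 3, "singular"))  # False
```

Classes III and IV are thematic but excluded from the palatalizing set, which matches their description as reflecting a consistent o theme rather than the alternating o/e theme.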

Subjunctive

The subjunctive likewise has 12 classes, denoted i through xii. Most are conjugated identically to the corresponding indicative classes; indicative and subjunctive are distinguished by the fact that a verb in a given indicative class will usually belong to a different subjunctive class.

In addition, four subjunctive classes differ from the corresponding indicative classes: two "special subjunctive" classes with differing suffixes and two "varying subjunctive" classes with root ablaut reflecting the PIE perfect.

Special subjunctives:

iv: Thematic with suffix i < PIE -y-, with consistent palatalization of final root consonant. Tocharian B only, rare.
vii: Thematic (not athematic, as in indicative class VII) with suffix ñ < PIE -n- (palatalized by thematic e, with palatalized variant generalized).

Varying subjunctives:

i: Athematic without suffix, with root ablaut reflecting PIE o-grade in active singular, zero-grade elsewhere. Derived from PIE perfect.
v: Identical to class i but with PToch suffix *-ā-, originally reflecting laryngeal-final roots but generalized.

Preterite

The preterite has 6 classes:

I: The most common class, with a suffix ā < PIE Ḥ (i.e. roots ending in a laryngeal, although widely extended to other roots). This class shows root ablaut, with original e-grade (and palatalization of the initial root consonant) in the active singular, contrasting with zero-grade (and no palatalization) elsewhere.
II: This class has reduplication in Tocharian A (possibly reflecting the PIE reduplicated aorist). However, Tocharian B has a vowel reflecting long PIE ē, along with palatalization of the initial root consonant. There is no ablaut in this class.
III: This class has a suffix s in the 3rd singular active and throughout the mediopassive, evidently reflecting the PIE sigmatic aorist. Root ablaut occurs between active and mediopassive. A few verbs have palatalization in the active along with s in the 3rd singular, but no palatalization and no s in the mediopassive, along with no root ablaut (the vowel reflects PToch ë). This suggests that, for these verbs in particular, the active originates in the PIE sigmatic aorist (with s suffix and ē vocalism) while the mediopassive stems from the PIE perfect (with o vocalism).
IV: This class has suffix ṣṣā, with no ablaut. Most verbs in this class are causatives.
V: This class has suffix ñ(ñ)ā, with no ablaut. Only a few verbs belong to this class.
VI: This class, which has only two verbs, is derived from the PIE thematic aorist. As in Greek, this class has different endings from all the others, which partly reflect the PIE secondary endings (as expected for the thematic aorist).

All except preterite class VI have a common set of endings that stem from the PIE perfect endings, although with significant innovations.

Imperative

The imperative likewise shows 6 classes, with a unique set of endings, found only in the second person, and a prefix beginning with p-. This prefix usually reflects Proto-Tocharian *pä- but unexpected connecting vowels occasionally occur, and the prefix combines with vowel-initial and glide-initial roots in unexpected ways. The prefix is often compared with the Slavic perfective prefix po-, although the phonology is difficult to explain.

Classes i through v tend to co-occur with preterite classes I through V, although there are many exceptions. Class vi is not so much a coherent class as an "irregular" class with all verbs not fitting in other categories. The imperative classes tend to share the same suffix as the corresponding preterite (if any), but to have root vocalism that matches the vocalism of a verb's subjunctive. This includes the root ablaut of subjunctive classes i and v, which tend to co-occur with imperative class i.

Optative and imperfect

The optative and imperfect have related formations. The optative is generally built by adding i onto the subjunctive stem. Tocharian B likewise forms the imperfect by adding i onto the present indicative stem, while Tocharian A has 4 separate imperfect formations: usually ā is added to the subjunctive stem, but occasionally to the indicative stem, and sometimes either ā or s is added directly onto the root. The endings differ between the two languages: Tocharian A uses present endings for the optative and preterite endings for the imperfect, while Tocharian B uses the same endings for both, which are a combination of preterite and unique endings (the latter used in the singular active).
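The formation rules in the paragraph above can be restated as a small lookup table. This is only an illustrative sketch: the names `Formation`, `FORMATIONS` and `describe` are invented here, and only the default pattern for each language is encoded (the three rarer Tocharian A imperfect formations are left out).

```python
from typing import NamedTuple


class Formation(NamedTuple):
    base: str     # stem that the suffix attaches to
    suffix: str   # suffix named in the text
    endings: str  # ending set named in the text


# Hypothetical encoding of the default patterns described in the text.
# Tocharian A's rarer imperfect formations are deliberately omitted.
B_ENDINGS = "preterite + unique singular active"
FORMATIONS = {
    ("A", "optative"): Formation("subjunctive stem", "i", "present"),
    ("A", "imperfect"): Formation("subjunctive stem", "ā", "preterite"),
    ("B", "optative"): Formation("subjunctive stem", "i", B_ENDINGS),
    ("B", "imperfect"): Formation("present indicative stem", "i", B_ENDINGS),
}


def describe(language: str, category: str) -> str:
    """Return a one-line summary of how a category is built in a language."""
    f = FORMATIONS[(language, category)]
    return f"Tocharian {language} {category}: {f.base} + {f.suffix}, {f.endings} endings"


if __name__ == "__main__":
    for key in FORMATIONS:
        print(describe(*key))
```

Laid out this way, the asymmetry is easy to see: Tocharian B keeps the suffix and endings constant and varies the stem, while Tocharian A keeps the stem (in the default case) and varies the suffix and endings.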

Endings

As suggested by the above discussion, there are a large number of sets of endings. The present-tense endings come in both thematic and athematic variants, although they are related, with the thematic endings generally reflecting a theme vowel (PIE e or o) plus the athematic endings. There are different sets for the preterite classes I through V; preterite class VI; the imperative; and in Tocharian B, in the singular active of the optative and imperfect. Furthermore, each set of endings comes with both active and mediopassive forms. The mediopassive forms are quite conservative, directly reflecting the PIE variation between -r in the present and -i in the past. (Most other languages with the mediopassive have generalized one of the two.)

The present-tense endings are almost completely divergent between Tocharian A and B. The following shows the thematic endings, with their origin:

Thematic present active indicative endings
	Original PIE	Tocharian B: PIE source	Tocharian B: actual form	Tocharian A: PIE source	Tocharian A: actual form	Notes
1st sing	*-o-h₂	*-o-h₂ + PToch -u	-āu	*-o-mi	-am	*-mi < PIE athematic present
2nd sing	*-e-si	*-e-th₂e?	-'t	*-e-th₂e	-'t	-th₂e < PIE perfect; previous consonant palatalized; Tocharian B form should be -'ta*
3rd sing	*-e-ti	*-e-nu	-'(ä)ṃ	*-e-se	-'ṣ	-nu < PIE nu "now"; previous consonant palatalized
1st pl	*-o-mos?	*-o-mō?	-em(o)	*-o-mes + V	-amäs
2nd pl	*-e-te	*-e-tē-r + V	-'cer	*-e-te	-'c	*-r < PIE mediopassive?; previous consonant palatalized
3rd pl	*-o-nti	*-o-nt	-eṃ	*-o-nti	-eñc < *-añc	*-o-nt < PIE secondary ending
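To make the divergence between the two languages concrete, the attested forms from the table can be held in a dictionary and compared slot by slot. This is a minimal sketch: the names `ENDINGS`, `shared_forms` and `palatalizing` are invented for illustration, the Tocharian A third plural is given in its attested form -eñc, and reading the leading apostrophe as the mark of "previous consonant palatalized" is an assumption drawn from the Notes column.

```python
# Attested thematic present active endings, copied from the table above.
ENDINGS = {
    "1sg": {"PIE": "*-o-h₂", "B": "-āu", "A": "-am"},
    "2sg": {"PIE": "*-e-si", "B": "-'t", "A": "-'t"},
    "3sg": {"PIE": "*-e-ti", "B": "-'(ä)ṃ", "A": "-'ṣ"},
    "1pl": {"PIE": "*-o-mos?", "B": "-em(o)", "A": "-amäs"},
    "2pl": {"PIE": "*-e-te", "B": "-'cer", "A": "-'c"},
    "3pl": {"PIE": "*-o-nti", "B": "-eṃ", "A": "-eñc"},
}


def shared_forms() -> list[str]:
    """Slots where Tocharian A and B show the identical ending."""
    return [slot for slot, row in ENDINGS.items() if row["A"] == row["B"]]


def palatalizing(language: str) -> list[str]:
    """Slots whose ending starts with the apostrophe (assumed palatalization mark)."""
    return [slot for slot, row in ENDINGS.items() if row[language].startswith("-'")]


if __name__ == "__main__":
    print("identical in A and B:", shared_forms())
    print("palatalizing in B:", palatalizing("B"))
    print("palatalizing in A:", palatalizing("A"))
```

Only the second singular comes out identical in both languages, while the palatalizing slots (second and third singular, second plural) line up with the PIE endings that contained the theme vowel e.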

Comparison to other Indo-European languages

Tocharian vocabulary (sample)
English	Tocharian A	Tocharian B	Ancient Greek	Hittite	Sanskrit	Latin	Proto-Germanic	Gothic	Old Irish	Proto-Slavic	Armenian	Proto-Indo-European
one	sas	ṣe	heîs, hen	ās	sa(kṛ́t)	semel	*simla	simle	samail	*sǫ-	mi	Template:Wikt-lang > PToch *sems
two	wu	wi	dúo	dān	dvā́	duo	*twai	twái	dá	*dъva	erku	Template:Wikt-lang
three	tre	trai	treîs	tēries	tráyas	trēs	*þrīz	þreis	trí	*trьje	erekʻ	Template:Wikt-lang
four	śtwar	śtwer	téttares, téssares	meyawes	catvā́ras, catúras	quattuor	*fedwōr	fidwōr	cethair	*četỳre	čʻorkʻ	Template:Wikt-lang
five	päñ	piś	pénte	?	páñca	quīnque	*fimf	fimf	cóic	*pętь	hing	Template:Wikt-lang
six	ṣäk	ṣkas	héx	?	ṣáṣ	sex	*sehs	saihs	sé	*šestь	vecʻ	Template:Wikt-lang
seven	ṣpät	ṣukt	heptá	sipta	saptá	septem	*sebun	sibun	secht	*sedmь	eōtʻn	Template:Wikt-lang
eight	okät	okt	oktṓ	?	aṣṭáu, aṣṭá	octō	*ahtōu	ahtau	ocht	*osmь	utʻ	Template:Wikt-lang
nine	ñu	ñu	ennéa	?	náva	novem	*newun	niun	noí	*dȅvętь	inn	Template:Wikt-lang
ten	śäk	śak	déka	?	dáśa	decem	*tehun	taihun	deich	*dȅsętь	tasn	Template:Wikt-lang
hundred	känt	kante	hekatón	?	śatám	centum	*hundą	hund	cét	*sъto		Template:Wikt-lang
father	pācar	pācer	patḗr	atta	pitṛ	pater	*fadēr	fadar	athair	*patr		Template:Wikt-lang
mother	mācar	mācer	mḗtēr	anna	mātṛ	māter	*mōdēr	mōdar	máthair	*màti		Template:Wikt-lang
brother	pracar	procer	phrā́tēr	negna/nekna	bhrātṛ	frāter	*brōþēr	brōþar	bráthair	*bràtrъ		Template:Wikt-lang
sister	ṣar	ṣer	éor	negah	svásṛ	soror	*swestēr	swistar	siur	*sestrà		Template:Wikt-lang
horse	yuk	yakwe	híppos	ekku	áśva-	equus	*ehwaz		ech	(Balto-Slavic *áśwāˀ)		Template:Wikt-lang
cow	ko	keu	boûs	suppal / kuwāu	gaúṣ	bōs	*kūz	(OE cū)	bó	*govę̀do		Template:Wikt-lang
voice	wak	wek	épos	?	vāk	vōx	*wōhmaz	(Du gewag)	foccul	*vikъ		Template:Wikt-lang
name	ñom	ñem	ónoma	halzāi	nāman-	nōmen	*namô	namō	ainmm	*jь̏mę		Template:Wikt-lang
to milk	mālkā	mālkant	amélgein	?	–	mulgēre	*melkaną	(OE me(o)lcan)	bligid (MIr)	*melzti		Template:Wikt-lang

In traditional Indo-European studies, no hypothesis of a closer genealogical relationship of the Tocharian languages has been widely accepted by linguists. However, lexicostatistical and glottochronological approaches suggest that the Anatolian languages, including Hittite, might be the closest relatives of Tocharian.<ref>Holm, Hans J. (2008). "The Distribution of Data in Word Lists and its Impact on the Subgrouping of Languages", In: Christine Preisach, Hans Burkhardt, Lars Schmidt-Thieme, Reinhold Decker (Editors): Data Analysis, Machine Learning, and Applications. Proc. of the 31st Annual Conference of the German Classification Society (GfKl), University of Freiburg, March 7–9, 2007. Springer-Verlag, Heidelberg-Berlin.</ref><ref>Václav Blažek (2007), "From August Schleicher to Sergej Starostin; On the development of the tree-diagram models of the Indo-European languages". Journal of Indo-European Studies 35 (1&2): 82–109.</ref><ref>Template:Cite journal</ref> As an example, the same Proto-Indo-European root (but not a common suffixed formation) can be reconstructed to underlie the words for 'wheel': Tocharian A wärkänt, Tocharian B yerkwanto, and Hittite ḫūrkis.

Contact with other languages

Michaël Peyrot argues that several of the most striking typological peculiarities of Tocharian are rooted in prolonged contact between Proto-Tocharian and an early stage of Proto-Samoyedic in South Siberia. This might explain the merger of all three stop series (e.g. *t, *d, *dʰ > *t), which must have produced a huge number of homonyms, as well as the restructuring of the vowel system, the development of agglutinative case marking, the loss of the dative case, and other features.
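The way a stop merger creates homonyms can be shown with a toy example. The sketch below encodes only the dental series mentioned above (*t, *d, *dʰ > *t); the function name `merge_stops` is invented, and the input syllables are artificial strings chosen for illustration, not reconstructed words.

```python
def merge_stops(form: str) -> str:
    """Collapse the dental stop series *t, *d, *dʰ into plain t."""
    out = []
    i = 0
    while i < len(form):
        # Check the two-character aspirate first so "dʰ" is not split up.
        if form.startswith("dʰ", i):
            out.append("t")
            i += 2
        elif form[i] in ("d", "t"):
            out.append("t")
            i += 1
        else:
            out.append(form[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Artificial syllables, distinct before the merger.
    toy_forms = ["ta", "da", "dʰa"]
    merged = {form: merge_stops(form) for form in toy_forms}
    print(merged)
    print("distinct forms after merger:", len(set(merged.values())))
```

Three syllables that differ only in their initial stop end up as a single form, which is the mechanism behind the large number of homonyms mentioned in the text.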

In historical times, the Tocharian languages were in contact with various neighbouring languages, including Iranian, Turkic, and Sinitic languages. Tocharian borrowings, and other Indo-European loanwords transmitted to Uralic, Turkic and Sinitic speakers, have been confirmed. Tocharian had a high social position within the region and influenced the Turkic languages, which later replaced Tocharian in the Tarim Basin.

Notable example

Most of the texts known from the Tocharians are religious, but one notable exception is a fragment of a love poem in Tocharian B (manuscript B-496, found in Kizil):

Figure (File:Tocharian B Love Poem.jpg): Tocharian B love poem, manuscript B-496 (one of two fragments).

References

Sources

Carling, Gerd (2009). Dictionary and Thesaurus of Tocharian A. Volume 1: a-j. (in collaboration with Georges-Jean Pinault and Werner Winter). Wiesbaden: Harrassowitz Verlag.
Lévi, Sylvain (1913). "Tokharian Pratimoksa Fragment". The Journal of the Royal Asiatic Society of Great Britain and Ireland, pp. 109–120.
Malzahn, Melanie (ed.) (2007). Instrumenta Tocharica. Heidelberg: Carl Winter Universitätsverlag.
Pinault, Georges-Jean (2008). Chrestomathie tokharienne: Textes et grammaire. Leuven-Paris: Peeters (Collection linguistique publiée par la Société de Linguistique de Paris, no. XCV).
Ringe, Donald A. (1996). On the Chronology of Sound Changes in Tocharian: Volume I: From Proto-Indo-European to Proto-Tocharian. New Haven, CT: American Oriental Society.
Schmalstieg, William R. (1974). "Tokharian and Baltic". Lituanus, v. 20, no. 3.
Winter, Werner (1998). "Tocharian". In Anna Giacalone Ramat and Paolo Ramat (eds.), The Indo-European Languages, 154–168. London: Routledge.

External links

Tocharian alphabet (from Omniglot)
Thesaurus Indogermanischer Text- und Sprachmaterialien (TITUS):
Mark Dickens, "Everything you always wanted to know about Tocharian"
Tocharian Online by Todd B. Krause and Jonathan Slocum, free online lessons at the Linguistics Research Center at the University of Texas at Austin
Online dictionary of Tocharian B, based upon D. Q. Adams's A Dictionary of Tocharian B (1999)
Tocharian B Swadesh list (From Wiktionary)
Comprehensive Edition of Tocharian Manuscripts, University of Vienna, with images, transcriptions and (in many cases) translations and other information.
Transcriptions of Tocharian A manuscripts.
glottothèque – Ancient Indo-European Grammars online, an online collection of introductory videos to Ancient Indo-European languages produced by the University of Göttingen
