Lexicostatistics
==Method==
===Create word list===
The aim is to generate a list of universally used meanings (hand, mouth, sky, I). Words are then collected for these meaning slots for each language being considered. Swadesh originally reduced a larger set of meanings to 200. He later found that it was necessary to reduce the list further, but that he could include some meanings that were not in his original list, giving his later 100-item list. The [[Swadesh list]] in [[Wiktionary]] gives the total of 207 meanings in a number of languages. Alternative lists that apply more rigorous criteria have been generated, e.g. the [[Dolgopolsky list]] and the [[Leipzig–Jakarta list]], as well as lists with a more specific scope; for example, [[Isidore Dyen|Dyen]], [[Joseph Kruskal|Kruskal]] and Black have 200 meanings for 84 [[Indo-European languages]] in digital form.<ref name=Dyen&al1992>{{cite journal |last1=Dyen |first1=Isidore |last2=Kruskal |first2=Joseph |last3=Black |first3=Paul |title=An Indoeuropean Classification, a Lexicostatistical Experiment |journal=Transactions of the American Philosophical Society |date=1992 |volume=82 |issue=5|pages=iii–132 |doi=10.2307/1006517 |jstor=1006517 }}</ref>

===Determine cognacies===
A trained and experienced linguist is needed to make cognacy decisions. The decisions may need to be refined as the state of knowledge increases, but lexicostatistics does not rely on all of them being correct. For each pair of words (in different languages) in this list, the cognacy of a form can be positive, negative or indeterminate. Sometimes a language has multiple words for one meaning, e.g. ''small'' and ''little'' for ''not big''.

===Calculate lexicostatistic percentages===
This percentage is the proportion of meanings, for a particular language pair, that are cognate, calculated relative to the total number of meanings excluding those whose cognacy is indeterminate.
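The calculation can be illustrated with a minimal Python sketch. The function name <code>cognate_percentage</code>, the <code>"+"</code>/<code>"-"</code>/<code>"?"</code> encoding of positive, negative and indeterminate decisions, and the sample judgments are all assumptions made for illustration; they are not real cognacy data and do not come from any published implementation.

```python
from typing import Mapping


def cognate_percentage(judgments: Mapping[str, str]) -> float:
    """Return the percentage of decided meanings judged cognate.

    `judgments` maps each meaning to "+" (cognate), "-" (not cognate)
    or "?" (indeterminate). Indeterminate meanings are left out of
    both the numerator and the denominator.
    """
    decided = [j for j in judgments.values() if j in ("+", "-")]
    if not decided:
        raise ValueError("no determinate cognacy decisions for this pair")
    return 100.0 * decided.count("+") / len(decided)


# Invented judgments for one hypothetical language pair.
sample = {"hand": "+", "mouth": "+", "sky": "-", "I": "?"}
percentage = cognate_percentage(sample)
print(round(percentage, 1))  # 66.7 (2 cognate out of 3 decided meanings)
```

In this sketch the indeterminate meaning ''I'' is dropped, so the denominator is 3 rather than 4.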
Each percentage is entered into an [[distance matrix|{{math|''N''×''N''}} table of distances]], where {{math|''N''}} is the number of languages being compared. When completed, this table is half-filled in [[triangular matrix|triangular]] form. The higher the proportion of cognates, the more closely the languages are related.

===Create family tree===
Creation of the language tree is based solely on the table described above. Various sub-grouping methods can be used, but the one adopted by Dyen, Kruskal and Black was:
* all lists are placed in a [[Pool (computer science)|pool]]
* the two closest members are removed and form a nucleus, which is placed in the pool
* this step is repeated
* under certain conditions a nucleus becomes a group
* this is repeated until the pool contains only one group.
Lexical percentages also have to be calculated for nuclei and groups.
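The pooling idea can be sketched in Python as follows. This is a simplified illustration and not the Dyen, Kruskal and Black procedure: the conditions under which a nucleus becomes a group are not given here, so the sketch simply merges the two closest members until one remains, and it assumes that the percentage between two merged sets is the average of the pairwise percentages (average linkage). The function name <code>group_languages</code>, the language labels and the percentages are invented for illustration.

```python
from itertools import combinations

# Half-filled (triangular) table: one entry per unordered language pair.
# The percentages are made up for this example.
table = {
    frozenset(("A", "B")): 80.0,
    frozenset(("A", "C")): 40.0,
    frozenset(("A", "D")): 20.0,
    frozenset(("B", "C")): 50.0,
    frozenset(("B", "D")): 30.0,
    frozenset(("C", "D")): 60.0,
}


def group_languages(languages, table):
    """Merge the closest pool members until one nested tuple remains."""
    pool = [(lang,) for lang in languages]   # each member lists its languages
    tree = {member: member[0] for member in pool}

    def percentage(m1, m2):
        # Assumption: average of all cross-member pairwise percentages.
        values = [table[frozenset((a, b))] for a in m1 for b in m2]
        return sum(values) / len(values)

    while len(pool) > 1:
        # The closest members are those with the highest shared percentage.
        m1, m2 = max(combinations(pool, 2), key=lambda pair: percentage(*pair))
        pool.remove(m1)
        pool.remove(m2)
        merged = m1 + m2
        tree[merged] = (tree.pop(m1), tree.pop(m2))
        pool.append(merged)                  # the new nucleus rejoins the pool
    return tree[pool[0]]


tree = group_languages(["A", "B", "C", "D"], table)
print(tree)  # (('A', 'B'), ('C', 'D'))
```

With these invented figures, A and B are joined first (80%), then C and D (60%), and the two nuclei are joined last at an average of 35%.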