{{Short description|Use of computational tools for the study of linguistics}}
{{About|the scientific field|the journal|Computational Linguistics (journal)}}
{{Linguistics|Subfields2}}
'''Computational linguistics''' is an [[Interdisciplinarity|interdisciplinary]] field concerned with the [[computational modelling]] of [[natural language]], as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics draws upon [[linguistics]], [[computer science]], [[artificial intelligence]], [[mathematics]], [[logic]], [[philosophy]], [[cognitive science]], [[cognitive psychology]], [[psycholinguistics]], [[anthropology]] and [[neuroscience]], among others. Computational linguistics is closely related to [[mathematical linguistics]].

==Origins==
The field has overlapped with [[artificial intelligence]] since the efforts in the United States in the 1950s to use computers to automatically translate texts from foreign languages, particularly Russian scientific journals, into English.<ref>John Hutchins: [http://www.hutchinsweb.me.uk/MTS-1999.pdf Retrospect and prospect in computer-based translation.] {{Webarchive|url=https://web.archive.org/web/20080414141215/http://www.hutchinsweb.me.uk/MTS-1999.pdf |date=2008-04-14 }} Proceedings of MT Summit VII, 1999, pp. 30–44.</ref> Since rule-based approaches could perform [[arithmetic]] (systematic) calculations much faster and more accurately than humans, it was expected that the [[lexicon]], [[morphology (linguistics)|morphology]], [[syntax]] and [[semantics]] of language could likewise be captured by explicit rules. After the [[AI winter|failure of rule-based approaches]], [[David G. Hays|David Hays]]<ref>{{cite web|url=http://nlp.shef.ac.uk/iccl/committee.html#deceased|title=Deceased members|website=ICCL members|access-date=15 November 2017|ref=ICCLmembers|archive-date=17 May 2017|archive-url=https://web.archive.org/web/20170517235543/http://nlp.shef.ac.uk/iccl/committee.html#deceased|url-status=dead}}</ref> coined the term in order to distinguish the field from AI and co-founded both the [[Association for Computational Linguistics|Association for Computational Linguistics (ACL)]] and the [[International Committee on Computational Linguistics]] (ICCL) in the 1970s and 1980s. What started as an effort to translate between languages evolved into the much wider field of [[natural language processing]].<ref>[http://www-nlpir.nist.gov/MINDS/FINAL/NLP.web.pdf Natural Language Processing by Liz Liddy, Eduard Hovy, Jimmy Lin, John Prager, Dragomir Radev, Lucy Vanderwende, Ralph Weischedel]</ref><ref>Arnold B. Barach: [https://www.flickr.com/photos/bostworld/2152048032/in/set-72157603898383698/ Translating Machine] 1975: And the Changes To Come.</ref>

==Annotated corpora==
To study the [[English language]] rigorously, researchers needed annotated text corpora. The Penn [[Treebank]]<ref>{{cite journal|author1=Marcus, M.|author2=Marcinkiewicz, M.|name-list-style=amp|year=1993|url=https://www.aclweb.org/anthology/J/J93/J93-2004.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://www.aclweb.org/anthology/J/J93/J93-2004.pdf |archive-date=2022-10-09 |url-status=live|title=Building a large annotated corpus of English: The Penn Treebank|journal=Computational Linguistics|volume=19|issue=2|pages=313–330}}</ref> became one of the most widely used such corpora. It consisted of IBM computer manuals, transcribed telephone conversations, and other texts, together containing over 4.5 million words of American English, annotated with both [[part-of-speech]] tags and syntactic bracketing.<ref>{{cite book|last1=Taylor|first1=Ann|title=Treebanks|date=2003|publisher=Springer Netherlands|pages=5–22|chapter=1}}</ref>
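The following fragment is a constructed illustration in the general style of Penn Treebank annotation (not an excerpt from the corpus): each word carries a part-of-speech tag such as <code>DT</code> (determiner) or <code>VBZ</code> (verb, third-person singular present), and nested brackets encode the phrase structure.

<pre>
( (S
    (NP-SBJ (DT The) (NN operator))
    (VP (VBZ presses)
        (NP (DT the) (JJ red) (NN button)))
    (. .) ))
</pre>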
Japanese sentence corpora have also been analyzed, and a pattern of [[log-normality]] was found in relation to sentence length.<ref name="autogenerated3">{{cite journal|author1=Furuhashi, S.|author2=Hayakawa, Y. |name-list-style=amp|year=2012|title=Lognormality of the Distribution of Japanese Sentence Lengths|journal=Journal of the Physical Society of Japan|volume=81|issue=3|page=034004|doi=10.1143/JPSJ.81.034004|bibcode=2012JPSJ...81c4004F }}</ref>

==Modeling language acquisition==
The fact that children are largely exposed only to positive evidence during [[language acquisition]],<ref>Bowerman, M. (1988). [http://pubman.mpdl.mpg.de/pubman/item/escidoc:468143:4/component/escidoc:532427/bowerman_1988_The-No.pdf The "no negative evidence" problem: How do children avoid constructing an overly general grammar. Explaining language universals].</ref> meaning that they receive evidence only for what is a correct form and none for what is incorrect,<ref name="autogenerated1971">Braine, M.D.S. (1971). On two types of models of the internalization of grammars. In D.I. Slobin (Ed.), The ontogenesis of grammar: A theoretical perspective. New York: Academic Press.</ref> was a limitation for the models of the time, because the [[deep learning]] models available today had not yet been developed in the late 1980s.<ref name="powers1989">Powers, D.M.W. & Turk, C.C.R. (1989). ''Machine Learning of Natural Language''. Springer-Verlag. {{ISBN|978-0-387-19557-5}}.</ref>

It has been shown that languages can be learned from simple input presented incrementally, as the child develops better memory and a longer attention span,<ref name="autogenerated1993">{{cite journal|title= Learning and development in neural networks: The importance of starting small|journal= Cognition|volume= 48|issue= 1|pages= 71–99|doi= 10.1016/0010-0277(93)90058-4|pmid= 8403835|year= 1993|last1= Elman|first1= Jeffrey L.|s2cid= 2105042|citeseerx= 10.1.1.135.4937}}</ref> which would explain the long period of [[language acquisition]] in human infants and children.<ref name="autogenerated1993"/>

Robots have been used to test linguistic theories.<ref>{{cite journal | last1 = Salvi | first1 = G. | last2 = Montesano | first2 = L. | last3 = Bernardino | first3 = A. | last4 = Santos-Victor | first4 = J. | year = 2012 | title = Language bootstrapping: learning word meanings from the perception-action association | journal = IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics | volume = 42 | issue = 3| pages = 660–71 | doi = 10.1109/TSMCB.2011.2172420 | pmid = 22106152 | arxiv = 1711.09714 | s2cid = 977486 }}</ref> Built to learn as children might, such models were based on an [[affordance]] model in which mappings between actions, perceptions, and effects were created and linked to spoken words. Crucially, these robots were able to acquire functioning word-to-meaning mappings without needing grammatical structure.

Using the [[Price equation]] and [[Pólya urn]] dynamics, researchers have created a system which not only predicts future linguistic evolution but also gives insight into the evolutionary history of modern-day languages.<ref>{{cite journal|author1=Gong, T.|author2=Shuai, L.|author3=Tamariz, M.|author4=Jäger, G.|name-list-style=amp|year=2012|title=Studying Language Change Using Price Equation and Pólya-urn Dynamics|editor=E. Scalas|journal=PLOS ONE|volume=7|issue=3|page=e33171|doi=10.1371/journal.pone.0033171|pmid=22427981|pmc=3299756|bibcode=2012PLoSO...733171G|doi-access=free}}</ref>
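A minimal sketch of Pólya urn dynamics (an illustrative toy model, not the cited authors' system) can be written in a few lines of Python: two competing linguistic variants start with one "ball" each, and every use of a variant adds another ball of the same kind, so past usage reinforces itself.

<syntaxhighlight lang="python">
import random

def polya_urn(steps, counts=None, seed=0):
    """Two-variant Pólya urn: each use of a variant adds another
    'ball' of that variant, so early advantages are reinforced."""
    rng = random.Random(seed)
    counts = dict(counts or {"variant_a": 1, "variant_b": 1})
    shares = []
    for _ in range(steps):
        # Draw a variant with probability proportional to its current count.
        variant = rng.choices(list(counts), weights=list(counts.values()))[0]
        counts[variant] += 1  # reinforcement step
        shares.append(counts["variant_a"] / sum(counts.values()))
    return counts, shares

final_counts, trajectory = polya_urn(steps=10_000)
print(final_counts, trajectory[-1])
</syntaxhighlight>

In such a process the long-run share of each variant settles to a limit that depends on the early random history, which is the sense in which present-day variant frequencies retain information about a language's past.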
==Chomsky's theories==
[[Noam Chomsky]]'s theories have influenced computational linguistics, particularly in understanding how infants learn complex grammatical structures, such as those described in [[Chomsky normal form]].<ref name="bansal2016">{{cite web |last1=Bansal |first1=Yogita |title=Insight to Computational Linguistics |url=https://d1wqtxts1xzle7.cloudfront.net/50283410/ijeter034102016-libre.pdf?1479039496=&response-content-disposition=inline%3B+filename%3DInsight_to_Computational_Linguistics.pdf&Expires=1727013507&Signature=OpBNq-Ocozu3StViVzaoeet1B7yVJUvnnLUxYpQKaTUr71Cho6YFoTZPv2k6ZzXtkxuZ3ViZNDJp~t5nLAyLxLk0mxGR6oVMQK4Rk68RaaCZVebBMvFMqKyRHGhwpbFLMbibo5eD7MHZQBPAxDwjBDGtX0TjORdrQ2XUCLw~vM7AtWsP3wtTj-TeHXSfQiL8DiyuvjEZEoqQ1NGhE2S1po~kTs5Eov-WFvYrfm4McdL~ztLUTdUmHyd3ntg0zI9pNPZG7CtouiHWtEA26fXOZEbD5Qv9C1~gnV8VTSLzxWSMwEe3od6vPKoW1jlngnLLK9VoldGapnaUjJtWtW2MKw__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA |publisher=International Journal 4.10 |access-date=September 22, 2024 |page=94 |date=2016}}</ref> Attempts have been made to determine how an infant learns a "non-normal grammar" as theorized by Chomsky normal form.<ref name="autogenerated1971"/> Research in this area combines structural approaches with computational models to analyze large [[Corpus linguistics|linguistic corpora]] like the Penn [[Treebank]], helping to uncover patterns in language acquisition.<ref name="bansal2016"/>
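In Chomsky normal form itself, every rule of a [[context-free grammar]] rewrites a nonterminal either as exactly two nonterminals or as a single terminal symbol (with a special allowance for the empty string). A rule with three symbols on its right-hand side can therefore be binarized by introducing an auxiliary nonterminal, as in the constructed example below (the symbol names are illustrative):

<pre>
Original rule:         VP → V NP PP
Chomsky normal form:   VP → V X1
                       X1 → NP PP
Lexical rules:         V → "sees"      P → "with"
</pre>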
==See also==
{{Portal|Philosophy}}
{{div col|colwidth=22em}}
* [[Artificial intelligence in fiction]]
* [[Collostructional analysis]]
* [[Computational lexicology]]
* [[Computational Linguistics (journal)|''Computational Linguistics'' (journal)]]
* [[Computational models of language acquisition]]
* [[Computational semantics]]
* [[Computational semiotics]]
* [[Computer-assisted reviewing]]
* [[Dialog systems]]
* [[Glottochronology]]
* [[Grammar induction]]
* [[Human speechome project]]
* [[Internet linguistics]]
* [[Lexicostatistics]]
* [[Natural language processing]]
* [[Natural language user interface]]
* [[Quantitative linguistics]]
* [[Semantic relatedness]]
* [[Semantometrics]]
* [[Systemic functional linguistics]]
* [[Translation memory]]
* [[Universal Networking Language]]
{{div col end}}

==References==
{{reflist|30em}}

==Further reading==
<!-- In alphabetical order by last name -->
{{Refbegin}}
* {{cite journal | last1 = Bates | first1 = M | year = 1995 | title = Models of natural language understanding | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 92 | issue = 22| pages = 9977–9982 | doi=10.1073/pnas.92.22.9977| pmid = 7479812 | pmc = 40721| bibcode = 1995PNAS...92.9977B | doi-access = free }}
* Steven Bird, Ewan Klein, and Edward Loper (2009). ''Natural Language Processing with Python''. O'Reilly Media. {{ISBN|978-0-596-51649-9}}.
* Daniel Jurafsky and James H. Martin (2008). ''Speech and Language Processing'', 2nd edition. Pearson Prentice Hall. {{ISBN|978-0-13-187321-6}}.
* Mohamed Zakaria Kurdi (2016). ''Natural Language Processing and Computational Linguistics: speech, morphology, and syntax'', Volume 1. ISTE-Wiley. {{ISBN|978-1848218482}}.
* Mohamed Zakaria Kurdi (2017). ''Natural Language Processing and Computational Linguistics: semantics, discourse, and applications'', Volume 2. ISTE-Wiley. {{ISBN|978-1848219212}}.
{{Refend}}

==External links==
{{Wikiversity}}
{{commons category}}
* [http://www.aclweb.org/ Association for Computational Linguistics (ACL)]
** [http://www.aclweb.org/anthology ACL Anthology of research papers]
** [http://aclweb.org/aclwiki/ ACL Wiki for Computational Linguistics]
* [http://www.CICLing.org/ CICLing annual conferences on Computational Linguistics] {{Webarchive|url=https://web.archive.org/web/20190206002457/http://www.cicling.org/ |date=2019-02-06 }}
* [https://web.archive.org/web/20110122142133/http://www.cla.imcsit.org/ Computational Linguistics – Applications workshop]
* {{webarchive |url=https://web.archive.org/web/20080125103030/http://www.gelbukh.com/clbook/ |date=January 25, 2008 |title=Free online introductory book on Computational Linguistics }}
* [https://web.archive.org/web/20180212202132/http://www.lt-world.org/ Language Technology World]
* [https://web.archive.org/web/20191025033136/http://www.cs.technion.ac.il/~gabr/resources/resources.html Resources for Text, Speech and Language Processing]
* [http://clg.wlv.ac.uk/ The Research Group in Computational Linguistics] {{Webarchive|url=https://web.archive.org/web/20130801110817/http://clg.wlv.ac.uk/ |date=2013-08-01 }}

{{Computer science}}
{{Authority control}}

{{DEFAULTSORT:Computational Linguistics}}
[[Category:Computational linguistics| ]]
[[Category:Formal sciences]]
[[Category:Cognitive science]]
[[Category:Computational fields of study]]
[[Category:Mathematical linguistics|*]]