===1970–1990===
* '''1971''' – [[DARPA]] funded five years of ''Speech Understanding Research'', a speech recognition research program seeking a minimum vocabulary size of 1,000 words. It was thought that [[natural-language understanding|speech ''understanding'']] would be key to making progress in speech ''recognition'', but this later proved untrue.<ref>{{Cite web |last=John Makhoul |title=ISCA Medalist: For leadership and extensive contributions to speech and language processing |url=https://www.superlectures.com/interspeech2016/isca-medalist-for-leadership-and-extensive-contributions-to-speech-and-language-processing |url-status=live |archive-url=https://web.archive.org/web/20180124071005/https://www.superlectures.com/interspeech2016/isca-medalist-for-leadership-and-extensive-contributions-to-speech-and-language-processing |archive-date=24 January 2018 |access-date=23 January 2018 |df=dmy-all}}</ref> [[BBN Technologies|BBN]], [[IBM]], [[Carnegie Mellon]] and [[Stanford Research Institute]] all participated in the program.<ref>{{Cite magazine |last1=Blechman |first1=R. O. |last2=Blechman |first2=Nicholas |date=23 June 2008 |title=Hello, Hal |url=https://www.newyorker.com/magazine/2008/06/23/hello-hal |url-status=live |archive-url=https://web.archive.org/web/20150120042048/http://www.newyorker.com/magazine/2008/06/23/hello-hal |archive-date=20 January 2015 |access-date=17 January 2015 |magazine=The New Yorker |df=dmy-all}}</ref><ref>{{Cite journal |last=Klatt |first=Dennis H. |year=1977 |title=Review of the ARPA speech understanding project |journal=The Journal of the Acoustical Society of America |volume=62 |issue=6 |pages=1345–1366 |bibcode=1977ASAJ...62.1345K |doi=10.1121/1.381666}}</ref> This revived speech recognition research after John Pierce's letter.
* '''1972''' – The IEEE Acoustics, Speech, and Signal Processing group held a conference in Newton, Massachusetts.
* '''1976''' – The first [[ICASSP]] was held in [[Philadelphia]]; it has since been a major venue for the publication of research on speech recognition.<ref>{{Cite web |last=Rabiner |date=1984 |title=The Acoustics, Speech, and Signal Processing Society. A Historical Perspective |url=http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/216_historical%20perspective.pdf |url-status=live |archive-url=https://web.archive.org/web/20170809113828/http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/216_historical%20perspective.pdf |archive-date=9 August 2017 |access-date=23 January 2018 |df=dmy-all}}</ref>

During the late 1960s, [[Leonard E. Baum|Leonard Baum]] developed the mathematics of [[Markov chain]]s at the [[Institute for Defense Analyses]]. A decade later, at Carnegie Mellon University (CMU), Raj Reddy's students [[James K. Baker|James Baker]] and [[Janet M. Baker]] began using the [[hidden Markov model]] (HMM) for speech recognition.<ref>{{Cite web |date=12 January 2015 |title=First-Hand:The Hidden Markov Model – Engineering and Technology History Wiki |url=http://ethw.org/First-Hand:The_Hidden_Markov_Model |url-status=live |archive-url=https://web.archive.org/web/20180403191314/http://ethw.org/First-Hand:The_Hidden_Markov_Model |archive-date=3 April 2018 |access-date=1 May 2018 |website=ethw.org |df=dmy-all}}</ref> James Baker had learned about HMMs during a summer job at the Institute for Defense Analyses while he was an undergraduate.<ref name="James Baker interview" /> The use of HMMs allowed researchers to combine different sources of knowledge, such as acoustics, language, and syntax, in a unified probabilistic model.

* By the '''mid-1980s''', IBM's [[Frederick Jelinek|Fred Jelinek's]] team had created a voice-activated typewriter called Tangora, which could handle a 20,000-word vocabulary.<ref>{{Cite web |date=2012-03-07 |title=Pioneering Speech Recognition |url=http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/speechreco/ |url-status=dead |archive-url=https://web.archive.org/web/20150219080748/http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/speechreco/ |archive-date=19 February 2015 |access-date=18 January 2015 |df=dmy-all}}</ref> Jelinek's statistical approach put less emphasis on emulating the way the human brain processes and understands speech, in favor of statistical modeling techniques like HMMs. (Jelinek's group independently discovered the application of HMMs to speech.<ref name="James Baker interview">{{Cite web |title=James Baker interview |url=http://www.sarasinstitute.org/Audio/JimBaker(2006).mp3 |url-status=live |archive-url=https://web.archive.org/web/20170828105222/http://www.sarasinstitute.org/Audio/JimBaker(2006).mp3 |archive-date=28 August 2017 |access-date=9 February 2017 |df=dmy-all}}</ref>) This was controversial among linguists, since HMMs are too simplistic to account for many common features of human languages.<ref>{{Cite journal |last1=Huang |first1=Xuedong |last2=Baker |first2=James |last3=Reddy |first3=Raj |date=January 2014 |title=A historical perspective of speech recognition |url=https://dl.acm.org/doi/fullHtml/10.1145/2500887 |journal=Communications of the ACM |language=en |volume=57 |issue=1 |pages=94–103 |doi=10.1145/2500887 |issn=0001-0782 |s2cid=6175701 |archive-url=https://web.archive.org/web/20231208161616/https://dl.acm.org/doi/fullHtml/10.1145/2500887 |archive-date=2023-12-08}}</ref> However, the HMM proved to be a highly useful way of modeling speech and replaced dynamic time warping as the dominant speech recognition algorithm in the 1980s.<ref>{{Cite report 
|url=http://www.ece.ucsb.edu/faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |title=Automatic speech recognition – a brief history of the technology development |last1=Juang |first1=B. H. |last2=Rabiner |first2=Lawrence R. |page=10 |access-date=17 January 2015 |archive-url=https://web.archive.org/web/20140817193243/http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |archive-date=17 August 2014 |url-status=live}}</ref><ref>{{Cite journal |last=Li |first=Xiaochang |date=2023-07-01 |title="There's No Data Like More Data": Automatic Speech Recognition and the Making of Algorithmic Culture |url=https://www.journals.uchicago.edu/doi/10.1086/725132 |journal=Osiris |language=en |volume=38 |pages=165–182 |doi=10.1086/725132 |issn=0369-7827 |s2cid=259502346}}</ref>
* '''1982''' – Dragon Systems, founded by James and [[Janet M. Baker]],<ref>{{Cite web |title=History of Speech Recognition |url=http://www.dragon-medical-transcription.com/history_speech_recognition.html |archive-url=https://web.archive.org/web/20150813223326/http://dragon-medical-transcription.com/history_speech_recognition.html |archive-date=13 August 2015 |access-date=17 January 2015 |website=Dragon Medical Transcription}}</ref> was one of IBM's few competitors.