===Pre-1970===
* '''1952''' – Three Bell Labs researchers, Stephen Balashek,<ref>{{Cite news |date=22 July 2012 |title=Obituaries: Stephen Balashek |url=https://obits.nj.com/obituaries/starledger/obituary.aspx?page=lifestory&pid=158702138 |work=The Star-Ledger |access-date=9 September 2024 |archive-date=4 April 2019 |archive-url=https://web.archive.org/web/20190404231352/https://obits.nj.com/obituaries/starledger/obituary.aspx?page=lifestory&pid=158702138 |url-status=live }}</ref> R. Biddulph, and K. H. Davis built a system called "Audrey"<ref>{{Cite web |title=IBM-Shoebox-front.jpg |url=https://cdn57.androidauthority.net/wp-content/uploads/2012/04/IBM-Shoebox-front.jpg |access-date=4 April 2019 |publisher=androidauthority.net |archive-date=9 August 2018 |archive-url=https://web.archive.org/web/20180809153221/https://cdn57.androidauthority.net/wp-content/uploads/2012/04/IBM-Shoebox-front.jpg |url-status=live }}</ref> for single-speaker digit recognition. Their system located the [[formants]] in the power spectrum of each utterance.<ref>{{Cite web |last1=Juang |first1=B. H. |last2=Rabiner |first2=Lawrence R. |title=Automatic speech recognition–a brief history of the technology development |url=http://www.ece.ucsb.edu/faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |url-status=live |archive-url=https://web.archive.org/web/20140817193243/http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |archive-date=17 August 2014 |access-date=17 January 2015 |page=6}}</ref> 
* '''1960''' – [[Gunnar Fant]] developed and published the [[source-filter model of speech production]]. 
* '''1962''' – [[IBM]] demonstrated its 16-word "Shoebox" machine's speech recognition capability at the [[1962 World's Fair]].<ref name="PCW.Siri">{{Cite magazine |last=Melanie Pinola |date=2 November 2011 |title=Speech Recognition Through the Decades: How We Ended Up With Siri |url=https://www.pcworld.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html |access-date=22 October 2018 |magazine=PC World |archive-date=3 November 2018 |archive-url=https://web.archive.org/web/20181103105727/https://www.pcworld.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html |url-status=live }}</ref>
* '''1966''' – [[Linear predictive coding]] (LPC), a [[speech coding]] method, was first proposed by [[Fumitada Itakura]] of [[Nagoya University]] and Shuzo Saito of [[Nippon Telegraph and Telephone]] (NTT), while working on speech recognition.<ref name="Gray">{{Cite journal |last=Gray |first=Robert M. |date=2010 |title=A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol |url=https://ee.stanford.edu/~gray/lpcip.pdf |journal=Found. Trends Signal Process. |volume=3 |issue=4 |pages=203–303 |doi=10.1561/2000000036 |issn=1932-8346 |doi-access=free |access-date=9 September 2024 |archive-date=9 October 2022 |archive-url=https://ghostarchive.org/archive/20221009/https://ee.stanford.edu/~gray/lpcip.pdf |url-status=live }}</ref>
* '''1969''' – Funding at [[Bell Labs]] dried up for several years after the influential [[John R. Pierce|John Pierce]] wrote an open letter critical of speech recognition research and defunded it.<ref name="jasapierce">{{Cite journal |last=John R. Pierce |author-link=John R. Pierce |date=1969 |title=Whither speech recognition? |journal=Journal of the Acoustical Society of America |volume=46 |issue=48 |pages=1049–1051 |bibcode=1969ASAJ...46.1049P |doi=10.1121/1.1911801}}</ref> The defunding lasted until Pierce retired and [[James L. Flanagan]] took over.

As a graduate student at [[Stanford University]] in the late 1960s, [[Raj Reddy]] was the first person to take on continuous speech recognition. Previous systems required users to pause after each word. Reddy's system accepted spoken commands for playing [[chess]].

Around this time, Soviet researchers invented the [[dynamic time warping]] (DTW) algorithm and used it to create a recognizer capable of operating on a 200-word vocabulary.<ref>{{Cite book |last1=Benesty |first1=Jacob |title=Springer Handbook of Speech Processing |last2=Sondhi |first2=M. M. |last3=Huang |first3=Yiteng |date=2008 |publisher=Springer Science & Business Media |isbn=978-3540491255}}</ref> DTW processed speech by dividing it into short frames, e.g. 10 ms segments, and processing each frame as a single unit. Although DTW would be superseded by later algorithms, the frame-based approach carried on. Speaker independence remained unsolved during this period.
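
The frame-by-frame alignment that DTW performs can be illustrated with a minimal sketch. It is not a reconstruction of the Soviet recognizer. It assumes each frame has already been reduced to a single number (real systems compared spectral feature vectors), and the names <code>dtw_distance</code> and <code>recognize</code> and the word templates are hypothetical, chosen only for illustration.

```python
import math


def dtw_distance(a, b):
    """Return the dynamic time warping cost between two frame sequences.

    Each element of `a` and `b` stands for one frame. Assumption: a frame
    is a single number here, and the local cost is the absolute difference.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # A frame may be matched, or one sequence may be stretched
            # by repeating a frame of the other.
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance a only
                cost[i][j - 1],      # advance b only
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Pick the template word with the lowest DTW cost (template matching)."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical per-frame feature values for two stored word templates.
templates = {"one": [1, 3, 4, 3, 1], "two": [2, 2, 5, 5, 2]}
# The same word as "one", spoken more slowly (some frames repeated).
slow_one = [1, 1, 3, 3, 4, 3, 1]
best = recognize(slow_one, templates)
```

Because the alignment may repeat frames, the slower utterance still matches the "one" template at zero cost, which is how DTW tolerates differences in speaking rate between an utterance and a stored template.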