===Education===
{{main|Pronunciation assessment}}
Automatic [[pronunciation]] assessment is the use of speech recognition to verify the correctness of pronounced speech,<ref>{{Citation |last1=El Kheir |first1=Yassine |title=Automatic Pronunciation Assessment — A Review |date=October 21, 2023 |publisher=Conference on Empirical Methods in Natural Language Processing |arxiv=2310.13974 |s2cid=264426545 |display-authors=1 |last2=Ali |first2=Ahmed}}</ref> as distinguished from manual assessment by an instructor or proctor.<ref>{{Cite journal |last1=Isaacs |first1=Talia |last2=Harding |first2=Luke |date=July 2017 |title=Pronunciation assessment |journal=Language Teaching |language=en |volume=50 |issue=3 |pages=347–366 |doi=10.1017/S0261444817000118 |issn=0261-4448 |s2cid=209353525 |doi-access=free}}</ref> The technology is also called speech verification, pronunciation evaluation, and pronunciation scoring. Its main application is computer-aided pronunciation teaching (CAPT), when combined with [[computer-aided instruction]] for [[computer-assisted language learning]] (CALL), speech [[Remedial education|remediation]], or [[accent reduction]]. Pronunciation assessment does not determine unknown speech (as in [[Digital dictation|dictation]] or [[automatic transcription]]); instead, knowing the expected word(s) in advance, it attempts to verify the correctness of the learner's pronunciation and ideally their [[Intelligibility (communication)|intelligibility]] to listeners,<ref>{{Citation |last1=Loukina |first1=Anastassia |title=INTERSPEECH 2015 |date=September 6, 2015 |pages=1917–1921 |chapter=Pronunciation accuracy and intelligibility of non-native speech |chapter-url=https://www.isca-speech.org/archive/pdfs/interspeech_2015/loukina15_interspeech.pdf |place=Dresden, Germany |publisher=[[International Speech Communication Association]] |quote=only 16% of the variability in word-level intelligibility can be explained by the presence of obvious mispronunciations. 
|display-authors=1 |last2=Lopez |first2=Melissa |access-date=9 September 2024 |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909053932/https://www.isca-speech.org/archive/pdfs/interspeech_2015/loukina15_interspeech.pdf |url-status=live }}</ref><ref name="obrien">{{Cite journal |last1=O’Brien |first1=Mary Grantham |last2=Derwing |first2=Tracey M. |display-authors=1 |date=31 December 2018 |title=Directions for the future of technology in pronunciation research and teaching |journal=Journal of Second Language Pronunciation |language=en |volume=4 |issue=2 |pages=182–207 |doi=10.1075/jslp.17001.obr |issn=2215-1931 |s2cid=86440885 |quote=pronunciation researchers are primarily interested in improving L2 learners’ intelligibility and comprehensibility, but they have not yet collected sufficient amounts of representative and reliable data (speech recordings with corresponding annotations and judgments) indicating which errors affect these speech dimensions and which do not. These data are essential to train ASR algorithms to assess L2 learners’ intelligibility. 
|doi-access=free |hdl-access=free |hdl=2066/199273}}</ref> sometimes along with features of [[Prosody (linguistics)|prosody]] that are often inconsequential, such as [[Intonation (linguistics)|intonation]], [[Pitch (music)|pitch]], [[Speech tempo|tempo]], [[Isochrony|rhythm]], and [[Vocal stress|stress]].<ref>{{Cite journal |last=Eskenazi |first=Maxine |date=January 1999 |title=Using automatic speech processing for foreign language pronunciation tutoring: Some issues and a prototype |url=https://www.lltjournal.org/item/10125-25043/ |journal=Language Learning & Technology |language=en |volume=2 |issue=2 |pages=62–76 |access-date=11 February 2023 |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909053942/https://www.lltjournal.org/item/10125-25043/ |url-status=live }}</ref> Pronunciation assessment is also used in [[reading tutoring]], for example in [[Microsoft Teams]]<ref>{{Cite news |last=Tholfsen |first=Mike |date=9 February 2023 |title=Reading Coach in Immersive Reader plus new features coming to Reading Progress in Microsoft Teams |url=https://techcommunity.microsoft.com/t5/education-blog/reading-coach-in-immersive-reader-plus-new-features-coming-to/ba-p/3734079 |access-date=12 February 2023 |work=Techcommunity Education Blog |publisher=Microsoft |language=en |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909052822/https://techcommunity.microsoft.com/t5/education-blog/reading-coach-in-immersive-reader-plus-new-features-coming-to/ba-p/3734079 |url-status=live }}</ref> and in products from Amira Learning.<ref>{{Cite news |last=Banerji |first=Olina |date=7 March 2023 |title=Schools Are Using Voice Technology to Teach Reading. Is It Helping? 
|url=https://www.edsurge.com/news/2023-03-07-schools-are-using-voice-technology-to-teach-reading-is-it-helping |access-date=7 March 2023 |work=EdSurge News |language=en |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909054611/https://www.edsurge.com/news/2023-03-07-schools-are-using-voice-technology-to-teach-reading-is-it-helping |url-status=live }}</ref> Automatic pronunciation assessment can also be used to help diagnose and treat [[speech disorders]] such as [[speech apraxia|apraxia]].<ref>{{Cite book |last1=Hair |first1=Adam |url=https://psi.engr.tamu.edu/wp-content/uploads/2018/04/hair2018idc.pdf |title=Proceedings of the 17th ACM Conference on Interaction Design and Children |last2=Monroe |first2=Penelope |date=19 June 2018 |isbn=9781450351522 |pages=119–131 |chapter=Apraxia world: A speech therapy game for children with speech sound disorders |doi=10.1145/3202185.3202733 |display-authors=1 |s2cid=13790002 |access-date=9 September 2024 |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909052803/https://psi.engr.tamu.edu/wp-content/uploads/2018/04/hair2018idc.pdf |url-status=live }}</ref> Assessing authentic listener intelligibility is essential for avoiding inaccuracies from [[Accent (sociolinguistics)|accent]] bias, especially in high-stakes assessments;<ref>{{Cite news |date=8 August 2017 |title=Computer says no: Irish vet fails oral English test needed to stay in Australia |url=https://www.theguardian.com/australia-news/2017/aug/08/computer-says-no-irish-vet-fails-oral-english-test-needed-to-stay-in-australia |access-date=12 February 2023 |work=The Guardian |agency=Australian Associated Press |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909052806/https://www.theguardian.com/australia-news/2017/aug/08/computer-says-no-irish-vet-fails-oral-english-test-needed-to-stay-in-australia |url-status=live }}</ref><ref>{{Cite news |last=Ferrier |first=Tracey |date=9 August 2017 
|title=Australian ex-news reader with English degree fails robot's English test |url=https://www.smh.com.au/technology/australian-exnews-reader-with-english-degree-fails-robots-english-test-20170809-gxsjv2.html |access-date=12 February 2023 |work=The Sydney Morning Herald |language=en |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909053307/https://www.smh.com.au/technology/australian-exnews-reader-with-english-degree-fails-robots-english-test-20170809-gxsjv2.html |url-status=live }}</ref><ref>{{Cite news |last1=Main |first1=Ed |last2=Watson |first2=Richard |date=9 February 2022 |title=The English test that ruined thousands of lives |url=https://www.bbc.com/news/uk-60264106 |access-date=12 February 2023 |work=BBC News |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909054614/https://www.bbc.com/news/uk-60264106 |url-status=live }}</ref> from words with multiple correct pronunciations;<ref>{{Cite web |last=Joyce |first=Katy Spratte |date=January 24, 2023 |title=13 Words That Can Be Pronounced Two Ways |url=https://www.rd.com/list/words-that-can-be-pronounced-two-ways/ |access-date=23 February 2023 |publisher=Reader's Digest |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909054447/https://www.rd.com/list/words-that-can-be-pronounced-two-ways/ |url-status=live }}</ref> and from phoneme coding errors in machine-readable pronunciation dictionaries.<ref>E.g., [[CMU Pronouncing Dictionary|CMUDICT]], {{Cite web |title=The CMU Pronouncing Dictionary |url=http://www.speech.cs.cmu.edu/cgi-bin/cmudict |access-date=15 February 2023 |website=www.speech.cs.cmu.edu |archive-date=15 August 2010 |archive-url=https://web.archive.org/web/20100815023012/http://www.speech.cs.cmu.edu/cgi-bin/cmudict |url-status=live }} Compare "four" given as "F AO R" with the vowel AO as in "caught," to "row" given as "R OW" with the vowel OW as in "oat."</ref> In 2022, researchers found that some newer
speech-to-text systems, based on [[end-to-end reinforcement learning]] to map audio signals directly into words, produce word and phrase confidence scores that correlate very closely with genuine listener intelligibility.<ref>{{Cite conference |last1=Tu |first1=Zehai |last2=Ma |first2=Ning |last3=Barker |first3=Jon |date=2022 |title=Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction |url=https://www.isca-speech.org/archive/pdfs/interspeech_2022/tu22b_interspeech.pdf |conference=INTERSPEECH 2022 |publisher=ISCA |pages=3493–3497 |doi=10.21437/Interspeech.2022-10408 |access-date=17 December 2023 |book-title=Proc. Interspeech 2022 |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909053824/https://www.isca-speech.org/archive/pdfs/interspeech_2022/tu22b_interspeech.pdf |url-status=live }}</ref> In the [[Common European Framework of Reference for Languages]] (CEFR) assessment criteria for "overall phonological control", intelligibility outweighs formally correct pronunciation at all levels.<ref>{{Cite book |url=https://rm.coe.int/cefr-companion-volume-with-new-descriptors-2018/1680787989 |title=Common European framework of reference for languages learning, teaching, assessment: Companion volume with new descriptors |date=February 2018 |publisher=Language Policy Programme, Education Policy Division, Education Department, [[Council of Europe]] |page=136 |oclc=1090351600 |access-date=9 September 2024 |archive-date=9 September 2024 |archive-url=https://web.archive.org/web/20240909053825/https://rm.coe.int/cefr-companion-volume-with-new-descriptors-2018/1680787989 |url-status=live }}</ref>
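
As a rough illustration of two points made above (the expected word is known in advance, and some words have more than one correct pronunciation), the following minimal sketch scores a recognized phoneme sequence against every accepted variant of the expected word and keeps the best match. It is a simplification for illustration only, not the method of any cited system: practical assessors generally work from acoustic evidence rather than comparing phoneme strings, and the function names and the small lexicon entry here are hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pronunciation_score(heard: Sequence[str],
                        accepted: List[Sequence[str]]) -> float:
    """Best similarity (0.0 to 1.0) between the heard phonemes and any
    accepted variant; 1.0 means an exact match with one variant."""
    best = 0.0
    for variant in accepted:
        dist = edit_distance(heard, variant)
        longest = max(len(heard), len(variant), 1)
        best = max(best, 1.0 - dist / longest)
    return best


# Hypothetical lexicon entry (an assumption for this example):
# two accepted ARPABET-style pronunciations of "either".
ACCEPTED_EITHER = [["IY", "DH", "ER"], ["AY", "DH", "ER"]]

if __name__ == "__main__":
    print(pronunciation_score(["AY", "DH", "ER"], ACCEPTED_EITHER))
    print(pronunciation_score(["IY", "D", "ER"], ACCEPTED_EITHER))
```

Taking the maximum over variants means a learner is not penalized for choosing one valid pronunciation over another. A score of this kind measures only agreement with a dictionary, so it does not capture listener intelligibility.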