=== Articulatory synthesis ===
{{Main|Articulatory synthesis}}
Articulatory synthesis consists of computational techniques for synthesizing speech based on models of the human [[vocal tract]] and the articulation processes occurring there. The first articulatory synthesizer regularly used for laboratory experiments was developed at [[Haskins Laboratories]] in the mid-1970s by [[Philip Rubin]], Tom Baer, and Paul Mermelstein. This synthesizer, known as ASY, was based on vocal tract models developed at [[Bell Laboratories]] in the 1960s and 1970s by Paul Mermelstein, Cecil Coker, and colleagues.

Until recently, articulatory synthesis models had not been incorporated into commercial speech synthesis systems. A notable exception is the [[NeXT]]-based system originally developed and marketed by Trillium Sound Research, a spin-off company of the [[University of Calgary]], where much of the original research was conducted. Following the demise of the various incarnations of NeXT (started by [[Steve Jobs]] in the late 1980s and merged with Apple Computer in 1997), the Trillium software was published under the GNU General Public License, with work continuing as [[gnuspeech]]. The system, first marketed in 1994, provides full articulatory-based text-to-speech conversion using a waveguide or transmission-line analog of the human oral and nasal tracts controlled by Carré's "distinctive region model".

More recent synthesizers, developed by Jorge C. Lucero and colleagues, incorporate models of vocal fold biomechanics, glottal aerodynamics, and acoustic wave propagation in the bronchi, trachea, and nasal and oral cavities, and thus constitute full systems of physics-based speech simulation.<ref name=":0">{{Cite journal|url = http://www.cic.unb.br/~lucero/papers/768_Paper.pdf|title = Physics-based synthesis of disordered voices|last1 = Lucero|first1 = J. C.|date = 2013|journal = Interspeech 2013|access-date = Aug 27, 2015|last2 = Schoentgen|first2 = J.|last3 = Behlau|first3 = M.|pages = 587–591|publisher = International Speech Communication Association|location = Lyon, France|doi = 10.21437/Interspeech.2013-161| s2cid=17451802 }}</ref><ref name=":1">{{Cite journal|last1=Englert|first1=Marina|last2=Madazio|first2=Glaucya|last3=Gielow|first3=Ingrid|last4=Lucero|first4=Jorge|last5=Behlau|first5=Mara|date=2016|title=Perceptual error identification of human and synthesized voices|journal=Journal of Voice|volume=30|issue=5|pages=639.e17–639.e23|doi=10.1016/j.jvoice.2015.07.017|pmid=26337775}}</ref>
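To illustrate the waveguide (transmission-line) idea mentioned above: the vocal tract can be approximated as a chain of short cylindrical tube sections, with pressure waves partially reflected at each junction according to the ratio of adjacent cross-sectional areas. The sketch below implements the classic Kelly–Lochbaum digital waveguide in this spirit; it is not the gnuspeech implementation nor Carré's distinctive region model, and the function name, boundary reflection values, and area profile are illustrative assumptions only.

<syntaxhighlight lang="python">
import numpy as np

def kelly_lochbaum_tract(areas, excitation, r_glottis=0.98, r_lips=-0.9):
    """Minimal Kelly-Lochbaum digital waveguide sketch (not gnuspeech).

    Propagates pressure waves through a chain of cylindrical tube
    sections (glottis -> lips); each section is one sample of delay in
    each direction, with scattering at junctions derived from the ratio
    of adjacent cross-sectional areas.
    """
    areas = np.asarray(areas, dtype=float)
    n = len(areas)
    # Reflection coefficient at the junction between sections i and i+1.
    k = (areas[:-1] - areas[1:]) / (areas[:-1] + areas[1:])
    fwd = np.zeros(n)   # right-going wave in each section
    bwd = np.zeros(n)   # left-going wave in each section
    out = np.empty(len(excitation))
    for t, e in enumerate(excitation):
        new_fwd = np.empty(n)
        new_bwd = np.empty(n)
        # Glottal end: inject the source, partially reflect returning waves.
        new_fwd[0] = e + r_glottis * bwd[0]
        # Interior junctions: Kelly-Lochbaum scattering equations.
        new_fwd[1:] = (1.0 + k) * fwd[:-1] - k * bwd[1:]
        new_bwd[:-1] = k * fwd[:-1] + (1.0 - k) * bwd[1:]
        # Lip end: partial sign-inverting reflection; the rest radiates out.
        new_bwd[-1] = r_lips * fwd[-1]
        out[t] = (1.0 + r_lips) * fwd[-1]
        fwd, bwd = new_fwd, new_bwd
    return out

# Illustrative use: a crude glottal pulse train through a made-up area profile.
fs = 44100
t = np.arange(fs // 2) / fs
glottal = np.maximum(0.0, np.sin(2 * np.pi * 110 * t)) ** 2  # ~110 Hz voicing
areas = [1.0, 0.9, 0.7, 0.5, 0.9, 1.8, 2.8, 3.2]             # arbitrary shape
audio = kelly_lochbaum_tract(areas, glottal)
</syntaxhighlight>

Varying the area profile over time is what shifts the tract's resonances (formants) and hence the perceived vowel; production systems such as gnuspeech drive such a model from higher-level articulatory control parameters rather than raw areas.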