Mel-frequency cepstrum
==History==
Paul Mermelstein<ref name=merm76>P. Mermelstein (1976), "[https://books.google.com/books?id=wW9QAAAAMAAJ&q=%22Distance+measures+for+speech+recognition,+psychological+and+instrumental%22 Distance measures for speech recognition, psychological and instrumental," in ''Pattern Recognition and Artificial Intelligence''], C. H. Chen, Ed., pp. 374–388. Academic, New York.</ref><ref name=merm80>S.B. Davis, and P. Mermelstein (1980), "[https://books.google.com/books?id=yjzCra5eW3AC&dq=cosine+mel+pols&pg=PA65 Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences]," in ''IEEE Transactions on Acoustics, Speech, and Signal Processing'', 28(4), pp. 357–366.</ref> is typically credited with the development of the MFC. Mermelstein credits Bridle and Brown<ref>J. S. Bridle and M. D. Brown (1974), "An Experimental Automatic Word-Recognition System", JSRU Report No. 1003, Joint Speech Research Unit, Ruislip, England.</ref> for the idea:
<blockquote>
Bridle and Brown used a set of 19 weighted spectrum-shape coefficients given by the cosine transform of the outputs of a set of nonuniformly spaced bandpass filters. The filter spacing is chosen to be logarithmic above 1 kHz and the filter bandwidths are increased there as well. We will, therefore, call these the mel-based cepstral parameters.<ref name=merm76/>
</blockquote>
Sometimes both early originators are cited.<ref>{{cite book | chapter = Automatic Speech Recognition: An Auditory Perspective |author1=Nelson Morgan |author2=Hervé Bourlard |author3=Hynek Hermansky |name-list-style=amp | title = Speech Processing in the Auditory System |editor1=Steven Greenberg |editor2=William A. Ainsworth | publisher = Springer | year = 2004 | isbn = 978-0-387-00590-4 | page = 315 | chapter-url = https://books.google.com/books?id=xWU2o08AxwwC&dq=mel-frequency+Mermelstein+Bridle&pg=PA315|author1-link=Nelson Morgan }}</ref>

Many authors, including Davis and Mermelstein,<ref name=merm80/> have commented that the spectral basis functions of the cosine transform in the MFC are very similar to the [[principal components]] of the log spectra, which were applied to speech representation and recognition much earlier by Pols and his colleagues.<ref>L. C. W. Pols (1966), "Spectral Analysis and Identification of Dutch Vowels in Monosyllabic Words," Doctoral dissertation, Free University, Amsterdam, the Netherlands</ref><ref>R. Plomp, L. C. W. Pols, and J. P. van de Geer (1967). "[http://dare.uva.nl/document/36194 Dimensional analysis of vowel spectra]." ''J. Acoustical Society of America,'' 41(3):707–712.</ref>