Affective computing
====Speech descriptors====
The complexity of the affect recognition process increases with the number of classes (affects) and speech descriptors used within the classifier. It is therefore crucial to select only the most relevant features, both to ensure that the model can identify emotions successfully and to improve computational performance, which is particularly significant for real-time detection. The range of possible choices is vast, with some studies mentioning the use of over 200 distinct features.<ref name="Scherer-2010-p241"/> Identifying and discarding features that are redundant or undesirable optimizes the system and increases the rate of correct emotion detection. The most common speech characteristics are categorized into the following groups.<ref name="Steidl-2011"/><ref name="Scherer-2010-p243"/>
# Frequency characteristics<ref>{{Cite book |doi=10.1109/ICCCI50826.2021.9402569|isbn=978-1-7281-5875-4|chapter=Non-linear frequency warping using constant-Q transformation for speech emotion recognition|title=2021 International Conference on Computer Communication and Informatics (ICCCI)|pages=1–4|year=2021|last1=Singh|first1=Premjeet|last2=Saha|first2=Goutam|last3=Sahidullah|first3=Md|arxiv=2102.04029|s2cid=231846518}}</ref>
#* Accent shape – affected by the rate of change of the fundamental frequency.
#* Average pitch – describes how high or low the speaker speaks relative to normal speech.
#* Contour slope – describes the tendency of the frequency change over time; it can be rising, falling or level.
#* Final lowering – the amount by which the frequency falls at the end of an utterance.
#* Pitch range – measures the spread between the maximum and minimum frequency of an utterance.
# Time-related features:
#* Speech rate – describes the rate of words or syllables uttered over a unit of time.
#* Stress frequency – measures the rate of occurrence of pitch-accented utterances.
# Voice quality parameters and energy descriptors:
#* Breathiness – measures the aspiration noise in speech.
#* Brilliance – describes the dominance of high or low frequencies in the speech.
#* Loudness – measures the amplitude of the speech waveform; translates to the energy of an utterance.
#* Pause discontinuity – describes the transitions between sound and silence.
#* Pitch discontinuity – describes the transitions of the fundamental frequency.
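
As an illustration of how a few of the descriptors listed above might be computed, the following is a minimal sketch that is not taken from the cited sources. It assumes that a fundamental-frequency (F0) contour has already been extracted as one value in hertz per voiced frame, and that the waveform is available as a list of amplitude samples. The function names, the 10 ms frame step, and the definition of final lowering as the overall mean minus the mean of the last few frames are all assumptions made for this example.

```python
from math import sqrt
from statistics import mean


def contour_slope(f0_hz, frame_s):
    """Least-squares slope of the F0 contour in Hz per second.

    Needs at least two frames; the sign indicates a rising (+) or
    falling (-) contour, and a value near zero indicates a level one.
    """
    times = [i * frame_s for i in range(len(f0_hz))]
    t_bar, f_bar = mean(times), mean(f0_hz)
    num = sum((t - t_bar) * (f - f_bar) for t, f in zip(times, f0_hz))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den


def frequency_descriptors(f0_hz, frame_s=0.01, tail_frames=3):
    """Summarise a voiced F0 contour (one value in Hz per frame)."""
    return {
        "average_pitch": mean(f0_hz),
        "pitch_range": max(f0_hz) - min(f0_hz),
        "contour_slope": contour_slope(f0_hz, frame_s),
        # Assumed definition: overall mean minus the mean of the last frames.
        "final_lowering": mean(f0_hz) - mean(f0_hz[-tail_frames:]),
    }


def loudness_rms(samples):
    """Root-mean-square amplitude, used here as a simple loudness proxy."""
    return sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical falling contour: 220 Hz down to 180 Hz in 5 Hz steps.
f0 = [220.0, 215.0, 210.0, 205.0, 200.0, 195.0, 190.0, 185.0, 180.0]
feats = frequency_descriptors(f0)
rms = loudness_rms([0.5, -0.5, 0.5, -0.5])
print(feats, rms)
```

Practical systems typically compute such statistics per utterance and pass them, together with many other descriptors, to a feature-selection step before classification.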