Keyword spotting
'''Keyword spotting''' (or more simply, '''word spotting''') is a problem first defined in the context of [[speech processing]].<ref name="giotis17">{{cite journal|last1=Giotis|first1=A.P.|last2=Sfikas|first2=G.|last3=Gatos|first3=B.|last4=Nikou|first4=C.|title=A survey of document image word spotting techniques|journal=Pattern Recognition|date=2017|volume=68|pages=310–332|ref=1|doi=10.1016/j.patcog.2017.02.023|bibcode=2017PatRe..68..310G }}</ref><ref name="rohlicek89" /> In speech processing, keyword spotting deals with the identification of [[Keyword (linguistics)|keywords]] in [[Utterance|utterances]].

Keyword spotting is also defined as a separate, but related, problem in the context of document image processing.<ref name="giotis17" /> In document image processing, keyword spotting is the problem of finding all instances of a query word in a scanned document image, without fully recognizing the document's text.

==In speech processing==
The first works in keyword spotting appeared in the late 1980s.<ref name="rohlicek89">{{cite journal|last1=Rohlicek|first1=J.|last2=Russell|first2=W.|last3=Roukos|first3=S.|last4=Gish|first4=H.|title=Continuous hidden Markov modeling for speaker-independent word spotting|journal=Proceedings of the 14th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)|date=1989|volume=1|pages=627–630|ref=rohlicek89}}</ref>

A special case of keyword spotting is wake word (also called hot word) detection, used by personal digital assistants such as [[Amazon Alexa|Alexa]] or [[Apple Siri|Siri]] to activate the dormant speaker, that is, to "wake up" when their name is spoken.

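As a rough illustration of how a wake word detector might turn a stream of scores into a single "wake up" event, the sketch below smooths per-frame keyword scores over a sliding window and fires when the average crosses a threshold. It is a minimal sketch and not the method of any particular assistant: the function name <code>detect_wake_word</code>, the <code>window</code> and <code>threshold</code> parameters, and the assumption that some acoustic model already produces a per-frame keyword probability are all hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator


def detect_wake_word(
    frame_scores: Iterable[float],
    window: int = 5,
    threshold: float = 0.8,
) -> Iterator[int]:
    """Yield the frame index at which each wake word detection starts.

    frame_scores is assumed to hold one keyword probability per audio
    frame, produced by some upstream model (hypothetical here).
    """
    recent = deque(maxlen=window)  # the last `window` frame scores
    active = False                 # True while a detection is ongoing
    for index, score in enumerate(frame_scores):
        recent.append(score)
        if len(recent) < window:
            continue  # wait for a full window before deciding
        smoothed = sum(recent) / window
        if smoothed >= threshold:
            if not active:
                active = True
                yield index  # report each detection only once
        else:
            active = False  # score dropped; allow a later detection


if __name__ == "__main__":
    scores = [0.1, 0.1, 0.9, 0.95, 0.9, 0.9, 0.2, 0.1, 0.1]
    print(list(detect_wake_word(scores, window=3, threshold=0.8)))  # prints [4]
```

Averaging over a window keeps a single noisy frame from waking the device, and the <code>active</code> flag reports a sustained run of high scores as one event rather than one event per frame.
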
In the United States, the [[National Security Agency]] has made use of keyword spotting since at least 2006.<ref>{{cite web|last1=Froomkin|first1=Dan|title=THE COMPUTERS ARE LISTENING|url=https://firstlook.org/theintercept/2015/05/05/nsa-speech-recognition-snowden-searchable-text/|website=The Intercept|date=5 May 2015 |accessdate=20 June 2015}}</ref> This technology allows analysts to search through large volumes of recorded conversations and isolate mentions of suspicious keywords. Recordings can be indexed and analysts can run queries over the database to find conversations of interest. [[IARPA]] funded research into keyword spotting in the [[Babel program]].

Some algorithms used for this task are:
* [[Sliding window]] and [[garbage model]]
* [[K-best hypothesis]]
* [[Iterative Viterbi decoding]]
* [[Convolutional neural network]] on [[Mel-frequency cepstrum]] coefficients<ref>{{Cite journal | author1 = Sainath, Tara N | author1-link = Tara Sainath | author2 = Parada, Carolina | journal = Sixteenth Annual Conference of the International Speech Communication Association | title = Convolutional neural networks for small-footprint keyword spotting | year = 2015 | arxiv = 1711.00333 }}</ref>
* [[Transformer_(machine_learning_model)|Transformer]]-based small-footprint keyword spotting<ref>{{Cite conference | url = https://www.isca-speech.org/archive/pdfs/interspeech_2021/wei21_interspeech.pdf | title = End-to-End Transformer-Based Open-Vocabulary Keyword Spotting with Location-Guided Local Attention | last1 = Wei | first1 = Bo | last2 = Yang | first2 = Meirong | last3 = Zhang | first3 = Tao | last4 = Tang | first4 = Xiao | last5 = Huang | first5 = Xing | last6 = Kim | first6 = Kyuhong | last7 = Lee | first7 = Jaeyun | last8 = Cho | first8 = Kiho | last9 = Park | first9 = Sung-Un | date = 30 August 2021 | year = 2021 | conference = Interspeech 2021 }}</ref>

==In document image processing==
Keyword spotting in document image processing can be seen as an instance of the more generic problem of [[content-based image retrieval]] (CBIR). Given a query, the goal is to retrieve the most relevant instances of words in a collection of scanned documents.<ref name="giotis17" /> The query may be a text string (query-by-string keyword spotting) or a word image (query-by-example keyword spotting).

==References==
{{reflist}}

[[Category:Pattern recognition]]