Spell checker
==Design==

A basic spell checker carries out the following processes:

* It scans the text and extracts the words contained in it.
* It then compares each word with a known list of correctly spelled words (i.e. a dictionary). This might contain just a list of words, or it might also contain additional information, such as hyphenation points or lexical and grammatical attributes.
* An additional step is a language-dependent algorithm for handling [[morphology (linguistics)|morphology]]. Even for a lightly inflected language like [[English language|English]], the spell checker will need to consider different forms of the same word, such as plurals, verbal forms, [[contraction (grammar)|contraction]]s, and [[possessive (linguistics)|possessive]]s. For many other languages, such as those featuring agglutination and more complex declension and conjugation, this part of the process is more complicated.

It is unclear whether morphological analysis—allowing for many forms of a word depending on its grammatical role—provides a significant benefit for English, though its benefits for highly [[synthetic language]]s such as German, Hungarian, or Turkish are clear.

As an adjunct to these components, the program's [[user interface]] allows users to approve or reject replacements and modify the program's operation.

Spell checkers can use [[approximate string matching]] algorithms such as [[Levenshtein distance]] to find correct spellings of misspelled words.<ref>{{Cite book|last=Perner|first=Petra|url=https://books.google.com/books?id=wnXJfsCGQC8C&q=%22spell+checking%22|title=Advances in Data Mining: Applications and Theoretical Aspects: 10th Industrial Conference, ICDM 2010, Berlin, Germany, July 12-14, 2010. Proceedings|date=2010-07-05|publisher=Springer Science & Business Media|isbn=978-3-642-14399-1|language=en}}</ref>

An alternative type of spell checker uses solely statistical information, such as [[n-gram]]s, to recognize errors instead of correctly spelled words.
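As an illustrative sketch only (not the method of any particular product or patent), a purely statistical checker of this kind can count character n-grams over a corpus of correctly spelled text and flag words containing n-grams that were never, or rarely, observed. All function names and the threshold below are invented for the example:

```python
from collections import Counter

def char_ngrams(word, n=3):
    """Character n-grams of a boundary-padded word, e.g. 'cat' -> '$ca', 'cat', 'at$'."""
    padded = f"${word}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def train(corpus_words, n=3):
    """Count character n-grams over a corpus of correctly spelled words."""
    counts = Counter()
    for w in corpus_words:
        counts.update(char_ngrams(w.lower(), n))
    return counts

def looks_misspelled(word, counts, n=3, threshold=1):
    """Flag a word if any of its n-grams occurs fewer than `threshold` times."""
    return any(counts[g] < threshold for g in char_ngrams(word.lower(), n))
```

Note that the quality of such a checker depends almost entirely on the size and coverage of the training corpus, since a rare but correct word can contain n-grams the corpus never produced.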
This approach usually requires a lot of effort to obtain sufficient statistical information. Key advantages include needing less runtime storage and the ability to correct errors in words that are not included in a dictionary.<ref>U.S. Patent 6618697, [https://patentimages.storage.googleapis.com/84/a6/5f/6b58b2e2c2da12/US6618697.pdf Method for rule-based correction of spelling and grammar errors]</ref>

<!-- The cited U.S. patent was used to implement a dictionary-less spelling correction algorithm for Graffiti on the Palm Pilot in only about 8k of memory. Another reference is "What Makes a Great Invention?", Wall Street Journal, 10/23/2003, https://www.wsj.com/articles/SB106684550065867100. An n-gram based algorithm is also included in Solr: http://lucidworks.com/blog/getting-started-spell-checking-with-apache-lucene-and-solr/. -->

In some cases, spell checkers use a fixed list of misspellings and [[spelling suggestion|suggestions]] for those misspellings; this less flexible approach is often used in paper-based correction methods, such as the ''see also'' entries of encyclopedias.

[[Clustering algorithm]]s have also been used for spell checking<ref>de Amorim, R.C.; Zampieri, M. (2013) [http://anthology.aclweb.org/R/R13/R13-1.pdf#page=200 Effective Spell Checking Methods Using Clustering Algorithms.] {{Webarchive|url=https://web.archive.org/web/20170817162117/http://anthology.aclweb.org/R/R13/R13-1.pdf#page=200 |date=2017-08-17 }} Proceedings of Recent Advances in Natural Language Processing (RANLP2013). Hissar, Bulgaria. pp. 172-178.</ref> combined with phonetic information.<ref>Zampieri, M.; de Amorim, R.C. (2014) [https://www.researchgate.net/profile/Renato_Amorim/publication/262603118_Between_Sound_and_Spelling_Combining_Phonetics_and_Clustering_Algorithms_to_Improve_Target_Word_Recovery/links/0a85e53cd2485a27fb000000.pdf Between Sound and Spelling: Combining Phonetics and Clustering Algorithms to Improve Target Word Recovery.] Proceedings of the 9th International Conference on Natural Language Processing (PolTAL). Lecture Notes in Computer Science (LNCS). Springer. pp. 438-449.</ref>
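The dictionary-based design outlined at the start of this section (scan the text, look each word up, and suggest nearby dictionary entries by [[Levenshtein distance]]) can be sketched as follows. This is a minimal illustration, assuming a plain set of lowercase correct words and a simple word-extraction regex; all names are chosen for the example:

```python
import re

def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def check(text, dictionary, max_distance=2):
    """Extract words from the text; for each word missing from the
    dictionary, suggest entries within max_distance edits, nearest first."""
    report = {}
    for word in re.findall(r"[a-zA-Z']+", text):
        if word.lower() not in dictionary:
            report[word] = sorted(
                (w for w in dictionary
                 if levenshtein(word.lower(), w) <= max_distance),
                key=lambda w: levenshtein(word.lower(), w))
    return report
```

For example, with `dictionary = {"the", "quick", "brown", "fox"}`, `check("teh quick fox", dictionary)` flags only `"teh"` and proposes `"the"`. A production checker would replace the linear scan of the dictionary with an index (such as a trie or BK-tree), since computing the distance to every entry is too slow for large word lists.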