Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Rabin–Karp algorithm
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
{{Short description|String searching algorithm}} {{no footnotes|date=September 2018}} {{Infobox algorithm |name = Rabin-Karp algorithm |class = [[string-searching algorithm|String searching]] |image = <!-- filename only, no "File:" or "Image:" prefix, and no enclosing [[brackets]] --> |caption = |data = |time = <math>O(mn)</math> plus <math>O(m)</math> preprocessing time |best-time = |average-time = <math>O(n)</math> |space = <math>O(1)</math> }} In [[computer science]], the '''Rabin–Karp algorithm''' or '''Karp–Rabin algorithm''' is a [[string-searching algorithm]] created by {{harvs|first1=Richard M.|last1=Karp|author1-link=Richard M. Karp|first2=Michael O.|last2=Rabin|author2-link=Michael O. Rabin|year=1987|txt}} that uses [[Hash function|hashing]] to find an exact match of a pattern string in a text. It uses a [[rolling hash]] to quickly filter out positions of the text that cannot match the pattern, and then checks for a match at the remaining positions. Generalizations of the same idea can be used to find more than one match of a single pattern, or to find matches for more than one pattern. To find a single match of a single pattern, the [[expected time]] of the algorithm is [[linear time|linear]] in the combined length of the pattern and text, although its [[Worst-case complexity|worst-case time complexity]] is the product of the two lengths. To find multiple matches, the expected time is linear in the input lengths, plus the combined length of all the matches, which could be greater than linear. In contrast, the [[Aho–Corasick algorithm]] can find all matches of multiple patterns in worst-case time and space linear in the input length and the number of matches (instead of the total length of the matches). A practical application of the algorithm is [[plagiarism detection|detecting plagiarism]]. Given source material, the algorithm can rapidly search through a paper for instances of sentences from the source material, ignoring details such as case and punctuation. Because of the abundance of the sought strings, single-string searching algorithms are impractical.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)