Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Rabin–Karp algorithm
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==The algorithm== The algorithm is as shown: <syntaxhighlight lang="php" line highlight="7"> function RabinKarp(string s[1..n], string pattern[1..m]) hpattern := hash(pattern[1..m]); for i from 1 to n-m+1 hs := hash(s[i..i+m-1]) if hs = hpattern if s[i..i+m-1] = pattern[1..m] return i return not found </syntaxhighlight> Lines 2, 4, and 6 each require [[Big-O notation|O]](''m'') time. However, line 2 is only executed once, and line 6 is only executed if the hash values match, which is unlikely to happen more than a few times. Line 5 is executed O(''n'') times, but each comparison only requires constant time, so its impact is O(''n''). The issue is line 4. Naively computing the hash value for the substring <code>s[i+1..i+m]</code> requires O(''m'') time because each character is examined. Since the hash computation is done on each loop, the algorithm with a naive hash computation requires O(''mn'') time, the same complexity as a straightforward string matching algorithm. For speed, the hash must be computed in constant time. The trick is the variable <code>hs</code> already contains the previous hash value of <code>s[i..i+m-1]</code>. If that value can be used to compute the next hash value in constant time, then computing successive hash values will be fast. The trick can be exploited using a [[rolling hash]]. A rolling hash is a hash function specially designed to enable this operation. A trivial (but not very good) rolling hash function just adds the values of each character in the substring. This rolling hash formula can compute the next hash value from the previous value in constant time: <pre> s[i+1..i+m] = s[i..i+m-1] - s[i] + s[i+m] </pre> This simple function works, but will result in statement 5 being executed more often than other more sophisticated rolling hash functions such as those discussed in the next section. Good performance requires a good hashing function for the encountered data. If the hashing is poor (such as producing the same hash value for every input), then line 6 would be executed O(''n'') times (i.e. on every iteration of the loop). Because character-by-character comparison of strings with length ''m'' takes O(''m'') time, the whole algorithm then takes a worst-case O(''mn'') time.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)