Perfect hash function
==Extensions==
===Dynamic perfect hashing===
{{main article|Dynamic perfect hashing}}
Using a perfect hash function is best in situations where there is a frequently queried large set, {{mvar|S}}, which is seldom updated. This is because any modification of the set {{mvar|S}} may cause the hash function to no longer be perfect for the modified set. Solutions which update the hash function any time the set is modified are known as [[dynamic perfect hashing]],<ref name="DynamicPerfectHashing">{{citation | last1 = Dietzfelbinger | first1 = Martin | last2 = Karlin | first2 = Anna | author2-link = Anna Karlin | last3 = Mehlhorn | first3 = Kurt | author3-link = Kurt Mehlhorn | last4 = Meyer auf der Heide | first4 = Friedhelm | last5 = Rohnert | first5 = Hans | last6 = Tarjan | first6 = Robert E. | author6-link = Robert Tarjan | doi = 10.1137/S0097539791194094 | issue = 4 | journal = [[SIAM Journal on Computing]] | mr = 1283572 | pages = 738–761 | title = Dynamic perfect hashing: upper and lower bounds | volume = 23 | year = 1994}}.</ref> but these methods are relatively complicated to implement.

===Minimal perfect hash function===
A minimal perfect hash function is a perfect hash function that maps {{mvar|n}} keys to {{mvar|n}} consecutive integers – usually the numbers from {{math|0}} to {{math|''n'' − 1}} or from {{math|1}} to {{mvar|n}}. A more formal way of expressing this is: Let {{mvar|j}} and {{mvar|k}} be elements of some finite set {{mvar|S}}. Then {{mvar|h}} is a minimal perfect hash function if and only if {{math|1=''h''(''j'') = ''h''(''k'')}} implies {{math|1=''j'' = ''k''}} ([[injectivity]]) and there exists an integer {{mvar|a}} such that the range of {{mvar|h}} is {{math|1=''a''..''a'' + {{!}}''S''{{!}} − 1}}. It has been proven that a general purpose minimal perfect hash scheme requires at least <math>\log_2 e \approx 1.44</math> bits/key.<ref name="CHD">{{citation | last1 = Belazzougui | first1 = Djamal | last2 = Botelho | first2 = Fabiano C. 
| last3 = Dietzfelbinger | first3 = Martin | contribution = Hash, displace, and compress | contribution-url = http://cmph.sourceforge.net/papers/esa09.pdf | doi = 10.1007/978-3-642-04128-0_61 | location = Berlin | mr = 2557794 | pages = 682–693 | publisher = Springer | series = [[Lecture Notes in Computer Science]] | title = Algorithms - ESA 2009 | volume = 5757 | isbn = 978-3-642-04127-3 | year = 2009| citeseerx = 10.1.1.568.130 | url = http://cmph.sourceforge.net/papers/esa09.pdf }}.</ref> Assuming that <math>S</math> is a set of size <math>n</math> containing integers in the range <math>[1, 2^{o(n)}]</math>, it is known how to efficiently construct an explicit minimal perfect hash function from <math>S</math> to <math>\{1, 2, \ldots, n\}</math> that uses space <math>n \log_2 e + o(n)</math> bits and that supports constant evaluation time.<ref>{{Citation |last1=Hagerup |first1=Torben |title=Efficient Minimal Perfect Hashing in Nearly Minimal Space |date=2001 |url=http://dx.doi.org/10.1007/3-540-44693-1_28 |work=STACS 2001 |pages=317–326 |access-date=2023-11-12 |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |isbn=978-3-540-41695-1 |last2=Tholey |first2=Torsten|doi=10.1007/3-540-44693-1_28 }}</ref> In practice, there are minimal perfect hashing schemes that use roughly 1.56 bits/key if given enough time.<ref name="RecSplit">{{citation | last1 = Esposito | first1 = Emmanuel | last2 = Mueller Graf | first2 = Thomas | last3 = Vigna | first3 = Sebastiano | contribution = RecSplit: Minimal Perfect Hashing via Recursive Splitting | doi = 10.1137/1.9781611976007.14 | pages = 175–185 | series = [[Proceedings]] | title = 2020 Proceedings of the Symposium on Algorithm Engineering and Experiments (ALENEX) | year = 2020 | arxiv = 1910.06416 | doi-access = free }}.</ref>

===k-perfect hashing===
A hash function is {{mvar|k}}-perfect if at most {{mvar|k}} elements from {{mvar|S}} are mapped onto the same value in the range. 
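
As an illustration only, the definitions above can be checked directly on a small key set. The following Python sketch is not taken from any cited source; the helper names <code>max_bucket_load</code>, <code>is_k_perfect</code> and <code>is_minimal_perfect</code> are hypothetical, and the toy "hash" (string length) is chosen purely so that collisions are easy to see.

```python
from collections import Counter
from typing import Callable, Collection, Hashable


def max_bucket_load(h: Callable[[Hashable], int], keys: Collection[Hashable]) -> int:
    """Return the largest number of keys that h maps to one single value."""
    counts = Counter(h(x) for x in keys)
    return max(counts.values(), default=0)


def is_k_perfect(h: Callable[[Hashable], int], keys: Collection[Hashable], k: int) -> bool:
    """True if at most k keys from the set share any one hash value."""
    return max_bucket_load(h, keys) <= k


def is_minimal_perfect(h: Callable[[Hashable], int], keys: Collection[Hashable]) -> bool:
    """True if h is injective on keys and its image is |keys| consecutive integers."""
    values = sorted(h(x) for x in keys)
    if not values:
        return True
    # Sorted values equal to a, a+1, ..., a+n-1 implies both injectivity and minimality.
    return values == list(range(values[0], values[0] + len(values)))


S = {"apple", "pear", "plum", "fig", "kiwi"}
print(max_bucket_load(len, S))  # 3: "pear", "plum" and "kiwi" all have length 4
print(is_k_perfect(len, S, 3))  # True
print(is_k_perfect(len, S, 2))  # False

# A lookup of each key's rank in sorted order is minimal perfect on S.
rank = {x: i for i, x in enumerate(sorted(S))}
print(is_minimal_perfect(rank.__getitem__, S))  # True
```

Under these definitions an ordinary perfect hash function is the special case {{math|1=''k'' = 1}}.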
The "hash, displace, and compress" algorithm can be used to construct {{mvar|k}}-perfect hash functions by allowing up to {{mvar|k}} collisions. The changes necessary to accomplish this are minimal, and are underlined in the adapted pseudocode below:
 (4) '''for all''' i{{thin space}}∈[r], in the order from (2), '''do'''
 (5)     '''for''' l{{thin space}}←{{thin space}}1,2,...
 (6)         '''repeat''' forming K<sub>i</sub>{{thin space}}←{{thin space}}{{{math|Φ}}<sub>l</sub>(x)|x{{thin space}}∈{{thin space}}B<sub>i</sub>}
 (6)         '''until''' |K<sub>i</sub>|=|B<sub>i</sub>| '''and''' K<sub>i</sub>∩{j|<u>T[j]=k</u>}={{thin space}}∅
 (7)     '''let''' σ(i):= the successful l
 (8)     '''for all''' j{{thin space}}∈{{thin space}}K<sub>i</sub> '''set''' <u>T[j]←T[j]+1</u>

===Order preservation===
A minimal perfect hash function {{mvar|F}} is ''order preserving'' if keys are given in some order {{math|''a''<sub>1</sub>, ''a''<sub>2</sub>, ..., ''a''<sub>''n''</sub>}} and for any keys {{math|''a''<sub>''j''</sub>}} and {{math|''a''<sub>''k''</sub>}}, {{math|''j'' < ''k''}} implies {{math|''F''(''a''<sub>''j''</sub>) < ''F''(''a''<sub>''k''</sub>)}}.<ref>{{Citation |first=Bob |last=Jenkins |contribution=order-preserving minimal perfect hashing |title=Dictionary of Algorithms and Data Structures |editor-first=Paul E. |editor-last=Black |publisher=U.S. National Institute of Standards and Technology |date=14 April 2009 |accessdate=2013-03-05 |url=https://xlinux.nist.gov/dads/HTML/orderPreservMinPerfectHash.html}}</ref> In this case, the function value is just the position of each key in the sorted ordering of all of the keys. A simple implementation of order-preserving minimal perfect hash functions with constant access time is to use an (ordinary) perfect hash function to store a lookup table of the positions of each key. This solution uses <math>O(n \log n)</math> bits, which is optimal in the setting where the comparison function for the keys may be arbitrary.<ref>{{citation |last1=Fox |first1=Edward A. 
|title=Order-preserving minimal perfect hash functions and information retrieval |date=July 1991 |url=http://eprints.cs.vt.edu/archive/00000248/01/TR-91-01.pdf |journal=ACM Transactions on Information Systems |volume=9 |issue=3 |pages=281–308 |location=New York, NY, USA |publisher=ACM |doi=10.1145/125187.125200 |s2cid=53239140 |last2=Chen |first2=Qi Fan |last3=Daoud |first3=Amjad M. |last4=Heath |first4=Lenwood S.}}.</ref> However, if the keys {{math|''a''<sub>1</sub>, ''a''<sub>2</sub>, ..., ''a''<sub>''n''</sub>}} are integers drawn from a universe <math>\{1, 2, \ldots, U\}</math>, then it is possible to construct an order-preserving hash function using only <math>O(n \log \log \log U)</math> bits of space.<ref>{{citation |last1=Belazzougui |first1=Djamal |title=Theory and practice of monotone minimal perfect hashing |date=November 2008 |journal=Journal of Experimental Algorithmics |volume=16 |at=Art. no. 3.2, 26pp |doi=10.1145/1963190.2025378 |s2cid=2367401 |last2=Boldi |first2=Paolo |last3=Pagh |first3=Rasmus |last4=Vigna |first4=Sebastiano |author3-link=Rasmus Pagh}}.</ref> Moreover, this bound is known to be optimal.<ref>{{Citation |last1=Assadi |first1=Sepehr |title=Tight Bounds for Monotone Minimal Perfect Hashing |date=January 2023 |url=http://dx.doi.org/10.1137/1.9781611977554.ch20 |work=Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) |pages=456–476 |access-date=2023-04-27 |place=Philadelphia, PA |publisher=Society for Industrial and Applied Mathematics |isbn=978-1-61197-755-4 |last2=Farach-Colton |first2=Martín |last3=Kuszmaul |first3=William|doi=10.1137/1.9781611977554.ch20 |arxiv=2207.10556 }}</ref>
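
The simple lookup-table construction described above might be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a method from the cited papers. The function names <code>build_order_preserving</code> and <code>evaluate</code> are hypothetical. The ordinary perfect hash function is found by trying salted BLAKE2b hashes until one is collision-free on the key set. For simplicity the table has <math>n^2</math> slots, so this sketch does not achieve the <math>O(n \log n)</math>-bit space bound.

```python
import hashlib
from typing import List, Sequence, Tuple


def _slot(key: str, seed: int, m: int) -> int:
    """Hash key into one of m slots, using seed as a BLAKE2b salt."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % m


def build_order_preserving(keys: Sequence[str]) -> Tuple[int, int, List[int]]:
    """Return (seed, m, table) such that table[_slot(key, seed, m)] is key's position.

    keys must be distinct and listed in the order that is to be preserved.
    """
    n = len(keys)
    # Assumption: a quadratic-size table, so that a seed without collisions
    # is found after only a few attempts if the hash behaves randomly.
    m = max(1, n * n)
    seed = 0
    while True:
        slots = [_slot(k, seed, m) for k in keys]
        if len(set(slots)) == n:  # perfect on this key set: no shared slot
            break
        seed += 1
    table = [-1] * m  # -1 marks slots that no key maps to
    for position, slot in enumerate(slots):
        table[slot] = position
    return seed, m, table


def evaluate(key: str, seed: int, m: int, table: List[int]) -> int:
    """Return the stored position; only meaningful for keys used in the build."""
    return table[_slot(key, seed, m)]


keys = ["delta", "alpha", "charlie", "bravo"]  # the given order a_1, ..., a_n
seed, m, table = build_order_preserving(keys)
print([evaluate(k, seed, m, table) for k in keys])  # [0, 1, 2, 3]
```

Evaluating the function on a key outside the original set returns either −1 or the position of some unrelated key, since, as with any perfect hash function, membership is not checked.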