Hash collision
== Background ==
Hash collisions can be unavoidable, depending on the number of objects in a set and on whether the bit string they are mapped to is long enough. For a set of ''n'' objects, if ''n'' is greater than |''R''|, where ''R'' is the range of the hash function, the probability of a hash collision is 1, meaning a collision is guaranteed to occur by the [[pigeonhole principle]].<ref name=":0">{{Cite book|date=2016|title=Cybersecurity and Applied Mathematics|url=http://dx.doi.org/10.1016/c2015-0-01807-x|doi=10.1016/c2015-0-01807-x|isbn=9780128044520}}</ref>

Another reason hash collisions are likely at some point stems from the [[birthday problem|birthday paradox]] in mathematics. This problem considers the probability that, among ''n'' randomly chosen people, at least two share a birthday.<ref>{{Cite book|last=Soltanian |first=Mohammad Reza Khalifeh |url=http://worldcat.org/oclc/1162249290|title=Theoretical and Experimental Methods for Defending Against DDoS Attacks|date=10 November 2015|isbn=978-0-12-805399-7|oclc=1162249290}}</ref> This idea has led to what is called the [[birthday attack]]. The premise of this attack is that it is difficult to find a person whose birthday matches your own or some other specific birthday, but the probability of finding ''any'' two people with matching birthdays is much higher.
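As a rough illustration of both points above, the following Python sketch computes the probability that at least two of ''n'' values collide, assuming each value is drawn independently and uniformly from a range of a given size. The function name <code>collision_probability</code> and its parameters are hypothetical and chosen only for this example.

```python
import math


def collision_probability(n: int, range_size: int) -> float:
    """Probability that at least two of n values collide.

    Assumes (for illustration only) that each value is drawn independently
    and uniformly at random from range_size possible hash values.
    """
    if n > range_size:
        # More objects than possible hash values: a collision is certain.
        return 1.0
    # P(no collision) = product over i in [0, n) of (1 - i / range_size).
    # Summing logarithms avoids underflow when n is large.
    log_no_collision = sum(math.log1p(-i / range_size) for i in range(n))
    # 1 - exp(x), computed via expm1 to keep precision when x is near 0.
    return -math.expm1(log_no_collision)


if __name__ == "__main__":
    # Birthday setting: 365 possible "hash values" (days of the year).
    for people in (10, 23, 50, 366):
        print(people, collision_probability(people, 365))
```

With 365 possible values, the probability already exceeds one half at 23 people, and it is exactly 1 once ''n'' exceeds 365.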
Bad actors can use this approach to make it simpler to find hash values that collide with any other hash value, rather than searching for a specific value.<ref>{{Citation|last1=Conrad|first1=Eric|title=Domain 3: Security Engineering (Engineering and Management of Security)|date=2016|url=http://dx.doi.org/10.1016/b978-0-12-802437-9.00004-7|work=CISSP Study Guide|pages=103–217|publisher=Elsevier|access-date=2021-12-08|last2=Misenar|first2=Seth|last3=Feldman|first3=Joshua|doi=10.1016/b978-0-12-802437-9.00004-7|isbn=9780128024379}}</ref>

The impact of collisions depends on the application. When hash functions and fingerprints are used to identify similar data, such as [[homology (biology)|homologous]] [[DNA]] sequences or similar audio files, the functions are designed so as to ''maximize'' the probability of collision between distinct but similar data, using techniques like [[locality-sensitive hashing]].<ref name="MOMD">{{cite web|last1=Rajaraman|first1=A.|last2=Ullman|first2=J.|author2-link=Jeffrey Ullman|year=2010|title=Mining of Massive Datasets, Ch. 3.|url=http://infolab.stanford.edu/~ullman/mmds.html}}</ref> [[Checksum]]s, on the other hand, are designed to minimize the probability of collisions between similar inputs, without regard for collisions between very different inputs.<ref name="crypto">{{Cite conference|last1=Al-Kuwari|first1=Saif|last2=Davenport|first2=James H.|last3=Bradford|first3=Russell J.|date=2011|title=Cryptographic Hash Functions: Recent Design Trends and Security Notions|url=https://eprint.iacr.org/2011/565|conference=Inscrypt '10}}</ref> Instances where bad actors attempt to create or find hash collisions are known as [[collision attack]]s.<ref>{{Cite book|last=Schema|first=Mike|title=Hacking Web Apps|year=2012}}</ref>

In practice, security-related applications use cryptographic hash algorithms, which are designed to produce output long enough for random matches to be unlikely, to be fast enough to be used anywhere, and to be safe enough that finding collisions would be extremely hard.<ref name="crypto" />
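
The difference between finding ''any'' collision and matching one ''specific'' hash value can be sketched on a deliberately weakened hash. The Python example below truncates SHA-256 to 16 bits, which is an assumption made purely so the search finishes quickly; the helper names <code>truncated_hash</code>, <code>find_any_collision</code> and <code>find_match_for</code> are hypothetical and not taken from any cited source.

```python
import hashlib


def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """SHA-256 cut down to n_bytes. Illustration only: far too short for real use."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_any_collision(n_bytes: int = 2):
    """Hash distinct inputs until any two of them share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    counter = 0
    while True:
        message = str(counter).encode()
        digest = truncated_hash(message, n_bytes)
        if digest in seen:
            # Returns the two colliding inputs and the number of hashes computed.
            return seen[digest], message, counter + 1
        seen[digest] = message
        counter += 1


def find_match_for(target: bytes, n_bytes: int = 2, max_tries: int = 1_000_000):
    """Search for a different input whose truncated digest equals that of target.

    Returns (message, tries), or None if max_tries is exhausted.
    """
    goal = truncated_hash(target, n_bytes)
    for counter in range(max_tries):
        message = b"candidate-" + str(counter).encode()
        if message != target and truncated_hash(message, n_bytes) == goal:
            return message, counter + 1
    return None


if __name__ == "__main__":
    first, second, tries = find_any_collision()
    print("any collision:", first, second, "after", tries, "hashes")
    print("specific match:", find_match_for(b"hello"))
```

With a 16-bit range, the search for any collision is expected to succeed after a few hundred hashes, on the order of the square root of the range size, while matching one specific value is expected to take on the order of the full range size of 65,536 hashes. Full-length cryptographic hashes make both searches infeasible in practice.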