==Approximations==
[[Image:Birthday paradox probability.svg|thumb|right|upright=1.4|Graphs showing the approximate probabilities of at least two people sharing a birthday ({{color|red|red}}) and its complementary event ({{color|blue|blue}})]]
[[Image:Birthday paradox approximation.svg|thumb|right|upright=1.4|A graph showing the accuracy of the approximation 1 − ''e''<sup>−''n''<sup>2</sup>/730</sup> ({{color|red|red}})]]

The [[Taylor series]] expansion of the [[exponential function]] (the constant {{math|''e'' ≈ {{val|2.718281828}}}})
:<math> e^x = 1 + x + \frac{x^2}{2!}+\cdots </math>
provides a first-order approximation for {{math|''e''<sup>''x''</sup>}} for <math>|x| \ll 1</math>:
:<math> e^x \approx 1 + x.</math>

To apply this approximation to the first expression derived for {{math|''{{overline|p}}''(''n'')}}, set {{math|''x'' {{=}} −{{sfrac|''a''|365}}}}. Thus,
:<math> e^{-a/365} \approx 1 - \frac{a}{365}. </math>

Then, replace {{mvar|a}} with non-negative integers for each term in the formula of {{math|''{{overline|p}}''(''n'')}} until {{math|''a'' {{=}} ''n'' − 1}}, for example, when {{math|''a'' {{=}} 1}},
: <math> e^{-1/365} \approx 1 - \frac{1}{365}. </math>

The first expression derived for {{math|''{{overline|p}}''(''n'')}} can be approximated as
:<math>
\begin{align}
\bar p(n) & \approx 1 \cdot e^{-1/365} \cdot e^{-2/365} \cdots e^{-(n-1)/365} \\[6pt]
& = e^{-\big(1+2+ \,\cdots\, +(n-1)\big)/365} \\[6pt]
& = e^{-\frac{n(n-1)/2}{365}} = e^{-\frac{n(n-1)}{730}}.
\end{align}
</math>

Therefore,
:<math> p(n) = 1-\bar p(n) \approx 1 - e^{-\frac{n(n-1)}{730}}.</math>

An even coarser approximation is given by
:<math>p(n)\approx 1-e^{-\frac{n^2}{730}},</math>
which, as the graph illustrates, is still fairly accurate.

According to the approximation, the same approach can be applied to any number of "people" and "days".
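The accuracy of the two exponential approximations can be checked numerically. The following is a minimal sketch, not part of the derivation itself: it assumes the exact probability is the usual product of the factors {{math|1 − ''a''/''d''}}, and the function names <code>p_exact</code>, <code>p_taylor</code> and <code>p_coarse</code> are illustrative only.

```python
import math


def p_exact(n: int, d: int = 365) -> float:
    """Exact probability that at least two of n people share a birthday,
    assuming d equally likely days (product of the factors 1 - a/d)."""
    p_bar = 1.0
    for a in range(n):
        p_bar *= (d - a) / d
    return 1.0 - p_bar


def p_taylor(n: int, d: int = 365) -> float:
    """Approximation 1 - exp(-n(n-1)/(2d)) from the first-order Taylor expansion."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * d))


def p_coarse(n: int, d: int = 365) -> float:
    """Coarser approximation 1 - exp(-n^2/(2d))."""
    return 1.0 - math.exp(-n * n / (2 * d))


if __name__ == "__main__":
    # Print the three values side by side for a few group sizes.
    for n in (10, 23, 40, 60):
        print(f"n={n:2d}  exact={p_exact(n):.6f}  "
              f"taylor={p_taylor(n):.6f}  coarse={p_coarse(n):.6f}")
```

For {{math|''n'' {{=}} 23}} all three values lie close to {{sfrac|1|2}}; passing a different <code>d</code> evaluates the same expressions for a "year" of another length.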
If there are {{mvar|d}} days rather than 365 and {{mvar|n}} people, with {{math|''n'' ≪ ''d''}}, then the same approach as above shows that the probability {{math|''p''(''n'', ''d'')}} that at least two out of {{mvar|n}} people share the same birthday from a set of {{mvar|d}} available days is:
:<math>\begin{align} p(n, d) & \approx 1-e^{-\frac{n(n-1)}{2d}} \\[6pt] & \approx 1-e^{-\frac{n^2}{2d}}. \end{align}</math>

===Simple exponentiation===
The probability of any two people not having the same birthday is {{sfrac|364|365}}. In a room containing ''n'' people, there are {{math|{{pars|s=150%|{{su|p=''n''|b=2|a=c}}}} {{=}} {{sfrac|''n''(''n'' − 1)|2}}}} pairs of people, i.e. {{math|{{pars|s=150%|{{su|p=''n''|b=2|a=c}}}}}} events. The probability of no two people sharing the same birthday can be approximated by assuming that these events are independent and hence by multiplying their probabilities together. Being independent would be equivalent to picking [[Sampling (statistics)#Replacement of selected units|with replacement]] any pair of people in the world, not just in a room.
In short, {{sfrac|364|365}} can be multiplied by itself {{math|{{pars|s=150%|{{su|p=''n''|b=2|a=c}}}}}} times, which gives us
:<math>\bar p(n) \approx \left(\frac{364}{365}\right)^\binom{n}{2}.</math>

Since this is the probability of no one having the same birthday, the probability of someone sharing a birthday is
:<math>p(n) \approx 1 - \left(\frac{364}{365}\right)^\binom{n}{2}.</math>

And for the group of 23 people, the probability of sharing is
:<math>p(23) \approx 1 - \left(\frac{364}{365}\right)^\binom{23}{2} = 1 - \left(\frac{364}{365}\right)^{253} \approx 0.500477 .</math>

===Poisson approximation===
Applying the [[Poisson distribution|Poisson]] approximation for the binomial on the group of 23 people,
:<math>\operatorname{Poi}\left(\frac{\binom{23}{2}}{365}\right) =\operatorname{Poi}\left(\frac{253}{365}\right) \approx \operatorname{Poi}(0.6932)</math>
so
:<math>\Pr(X>0)=1-\Pr(X=0) \approx 1-e^{-0.6932} \approx 1-0.499998=0.500002.</math>

As in the previous calculations, the result is just over 50%. This approximation is the same as the one above based on the Taylor expansion that uses {{math|''e<sup>x</sup>'' ≈ 1 + ''x''}}.

===Square approximation===
A good [[rule of thumb]] which can be used for [[mental calculation]] is the relation
:<math>p(n,d) \approx \frac{n^2}{2d}</math>
which can also be written as
:<math>n \approx \sqrt { 2d \times p(n,d)}</math>
which works well for probabilities less than or equal to {{sfrac|1|2}}. In these equations, {{mvar|d}} is the number of days in a year.

For instance, to estimate the number of people required for a {{sfrac|1|2}} chance of a shared birthday, we get
:<math>n \approx \sqrt{ 2 \times 365 \times \tfrac12} = \sqrt{365} \approx 19</math>
which is not too far from the correct answer of 23.
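The three shortcuts above (simple exponentiation, the Poisson approximation and the square rule) are easy to evaluate side by side. The sketch below only illustrates the formulas as stated; the function names are hypothetical and <code>d</code> defaults to 365 days.

```python
import math


def p_pairs(n: int, d: int = 365) -> float:
    """Simple exponentiation: treat the C(n, 2) pairs as independent events."""
    return 1.0 - ((d - 1) / d) ** math.comb(n, 2)


def p_poisson(n: int, d: int = 365) -> float:
    """Poisson approximation with mean C(n, 2) / d: Pr(X > 0) = 1 - exp(-mean)."""
    return 1.0 - math.exp(-math.comb(n, 2) / d)


def n_square_rule(p: float, d: int = 365) -> float:
    """Square approximation solved for n: n is roughly sqrt(2 * d * p)."""
    return math.sqrt(2 * d * p)


if __name__ == "__main__":
    print(f"pairs(23)    = {p_pairs(23):.6f}")
    print(f"poisson(23)  = {p_poisson(23):.6f}")
    print(f"square rule  = {n_square_rule(0.5):.2f} people for p = 1/2")
```

The first two functions give values just above {{sfrac|1|2}} for 23 people, while the square rule gives roughly 19, in line with the figures quoted in the text.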
===Approximation of number of people===
This can also be approximated using the following formula for the ''number'' of people necessary to have at least a {{sfrac|1|2}} chance of matching:
:<math>n \geq \tfrac{1}{2} + \sqrt{\tfrac{1}{4} + 2 \times \ln(2) \times 365} = 22.999943.</math>

This is a result of the good approximation that an event with {{math|{{sfrac|1|''k''}}}} probability will have a {{sfrac|1|2}} chance of occurring at least once if it is repeated {{math|''k'' [[natural logarithm of 2|ln 2]]}} times.<ref>{{cite journal | last = Mathis | first = Frank H. |date= June 1991 | title = A Generalized Birthday Problem | journal = SIAM Review | volume = 33 | issue = 2 | pages = 265–270 | issn = 0036-1445 | doi = 10.1137/1033051 | oclc = 37699182 | jstor = 2031144 | url = http://http.cs.berkeley.edu/~daw/papers/genbday-crypto02.ps }}</ref>

===Probability table===
{{Main|Birthday attack}}
:{| class="wikitable" style="white-space:nowrap;"
|-
! rowspan="2" | length of <br />hex string
! rowspan="2" | no. of<br />bits<br />({{mvar|b}})
! rowspan="2" | hash space<br />size<br />({{math|2<sup>''b''</sup>}})
! colspan="10" | Number of hashed elements such that probability of at least one hash collision ≥ {{mvar|p}}
|-
! {{mvar|p}} = {{val||e=-18}} !! {{mvar|p}} = {{val||e=-15}} !! {{mvar|p}} = {{val||e=-12}} !! {{mvar|p}} = {{val||e=-9}} !! {{mvar|p}} = {{val||e=-6}} !! {{mvar|p}} = 0.001 !! {{mvar|p}} = 0.01 !! {{mvar|p}} = 0.25 !! {{mvar|p}} = 0.50 !! {{mvar|p}} = 0.75
|- align="center"
| bgcolor="#F2F2F2" | 8 || bgcolor="#F2F2F2" | 32 || bgcolor="#F2F2F2" | {{val|4.3|e=9}} || 2 || 2 || 2 || 2.9 || 93 || {{val|2.9|e=3}} || {{val|9.3|e=3}} || {{val|5.0|e=4}} || {{val|7.7|e=4}} || {{val|1.1|e=5}}
|- align="center"
| bgcolor="#F2F2F2" | (10) || bgcolor="#F2F2F2" | (40) || bgcolor="#F2F2F2" | ({{val|1.1|e=12}}) || 2 || 2 || 2 || 47 || {{val|1.5|e=3}} || {{val|4.7|e=4}} || {{val|1.5|e=5}} || {{val|8.0|e=5}} || {{val|1.2|e=6}} || {{val|1.7|e=6}}
|- align="center"
| bgcolor="#F2F2F2" | (12) || bgcolor="#F2F2F2" | (48) || bgcolor="#F2F2F2" | ({{val|2.8|e=14}}) || 2 || 2 || 24 || {{val|7.5|e=2}} || {{val|2.4|e=4}} || {{val|7.5|e=5}} || {{val|2.4|e=6}} || {{val|1.3|e=7}} || {{val|2.0|e=7}} || {{val|2.8|e=7}}
|- align="center"
| bgcolor="#F2F2F2" | 16 || bgcolor="#F2F2F2" | 64 || bgcolor="#F2F2F2" | {{val|1.8|e=19}} || 6.1 || {{val|1.9|e=2}} || {{val|6.1|e=3}} || {{val|1.9|e=5}} || {{val|6.1|e=6}} || {{val|1.9|e=8}} || {{val|6.1|e=8}} || {{val|3.3|e=9}} || {{val|5.1|e=9}} || {{val|7.2|e=9}}
|- align="center"
| bgcolor="#F2F2F2" | (24) || bgcolor="#F2F2F2" | (96) || bgcolor="#F2F2F2" | ({{val|7.9|e=28}}) || {{val|4.0|e=5}} || {{val|1.3|e=7}} || {{val|4.0|e=8}} || {{val|1.3|e=10}} || {{val|4.0|e=11}} || {{val|1.3|e=13}} || {{val|4.0|e=13}} || {{val|2.1|e=14}} || {{val|3.3|e=14}} || {{val|4.7|e=14}}
|- align="center"
| bgcolor="#F2F2F2" | 32 || bgcolor="#F2F2F2" | 128 || bgcolor="#F2F2F2" | {{val|3.4|e=38}} || {{val|2.6|e=10}} || {{val|8.2|e=11}} || {{val|2.6|e=13}} || {{val|8.2|e=14}} || {{val|2.6|e=16}} || {{val|8.3|e=17}} || {{val|2.6|e=18}} || {{val|1.4|e=19}} || {{val|2.2|e=19}} || {{val|3.1|e=19}}
|- align="center"
| bgcolor="#F2F2F2" | (48) || bgcolor="#F2F2F2" | (192) || bgcolor="#F2F2F2" | ({{val|6.3|e=57}}) || {{val|1.1|e=20}} || {{val|3.5|e=21}} || {{val|1.1|e=23}} || {{val|3.5|e=24}} || {{val|1.1|e=26}} || {{val|3.5|e=27}} || {{val|1.1|e=28}} || {{val|6.0|e=28}} || {{val|9.3|e=28}} || {{val|1.3|e=29}}
|- align="center"
| bgcolor="#F2F2F2" | 64 || bgcolor="#F2F2F2" | 256 || bgcolor="#F2F2F2" | {{val|1.2|e=77}} || {{val|4.8|e=29}} || {{val|1.5|e=31}} || {{val|4.8|e=32}} || {{val|1.5|e=34}} || {{val|4.8|e=35}} || {{val|1.5|e=37}} || {{val|4.8|e=37}} || {{val|2.6|e=38}} || {{val|4.0|e=38}} || {{val|5.7|e=38}}
|- align="center"
| bgcolor="#F2F2F2" | (96) || bgcolor="#F2F2F2" | (384) || bgcolor="#F2F2F2" | ({{val|3.9|e=115}}) || {{val|8.9|e=48}} || {{val|2.8|e=50}} || {{val|8.9|e=51}} || {{val|2.8|e=53}} || {{val|8.9|e=54}} || {{val|2.8|e=56}} || {{val|8.9|e=56}} || {{val|4.8|e=57}} || {{val|7.4|e=57}} || {{val|1.0|e=58}}
|- align="center"
| bgcolor="#F2F2F2" | 128 || bgcolor="#F2F2F2" | 512 || bgcolor="#F2F2F2" | {{val|1.3|e=154}} || {{val|1.6|e=68}} || {{val|5.2|e=69}} || {{val|1.6|e=71}} || {{val|5.2|e=72}} || {{val|1.6|e=74}} || {{val|5.2|e=75}} || {{val|1.6|e=76}} || {{val|8.8|e=76}} || {{val|1.4|e=77}} || {{val|1.9|e=77}}
|}
[[File:birthday_attack_vs_paradox.svg|thumb|Comparison of the birthday problem (1) and birthday attack (2):{{parabreak}} In (1), collisions are found within one set, in this case, 3 out of 276 pairings of the 24 lunar astronauts.{{parabreak}} In (2), collisions are found between two sets, in this case, 1 out of 256 pairings of only the first bytes of SHA-256 hashes of 16 variants each of benign and harmful contracts.]]

The lighter fields in this table show the number of hashes needed to achieve the given probability of collision (column) given a hash space of a certain size in bits (row). Using the birthday analogy: the "hash space size" resembles the "available days", the "probability of collision" resembles the "probability of shared birthday", and the "required number of hashed elements" resembles the "required number of people in a group". One could also use this chart to determine the minimum hash size required (given upper bounds on the number of hashes and the probability of error), or the probability of collision (for a fixed number of hashes and a fixed hash size).
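Many entries of the table can be reproduced approximately by solving the coarse approximation {{math|''p'' ≈ 1 − ''e''<sup>−''n''<sup>2</sup>/(2''d'')</sup>}} for {{mvar|n}} with {{math|''d'' {{=}} 2<sup>''b''</sup>}}. The following is a minimal sketch under that assumption; it is not necessarily how the table was generated, the function name <code>hashes_for_collision</code> is hypothetical, and no lower bound is applied, so the smallest entries of the table (shown as 2) are not reproduced.

```python
import math


def hashes_for_collision(p: float, bits: int) -> float:
    """Approximate number of hashed elements needed for a collision
    probability of at least p in a hash space of 2**bits values.

    Solves p = 1 - exp(-n**2 / (2 * d)) for n with d = 2**bits.
    log1p keeps precision for very small p such as 1e-18.
    """
    d = 2.0 ** bits
    return math.sqrt(2.0 * d * -math.log1p(-p))


if __name__ == "__main__":
    # Print one row of estimates per hash size, in the table's column order.
    probabilities = (1e-18, 1e-15, 1e-12, 1e-9, 1e-6, 0.001, 0.01, 0.25, 0.50, 0.75)
    for bits in (32, 64, 128, 256):
        row = "  ".join(f"{hashes_for_collision(p, bits):.1e}" for p in probabilities)
        print(f"{bits:3d} bits: {row}")
```

For example, <code>hashes_for_collision(0.5, 128)</code> comes out at about {{val|2.2|e=19}}, which agrees with the corresponding cell of the 128-bit row.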
For comparison, {{val|e=-18}} to {{val|e=-15}} is the uncorrectable [[bit error rate]] of a typical hard disk.<ref>Jim Gray, Catharine van Ingen. [https://arxiv.org/abs/cs/0701166 Empirical Measurements of Disk Failure Rates and Error Rates]</ref> In theory, the collision probability of a 128-bit hash function, such as [[MD5]], should stay within that range until about {{val|8.2|e=11}} documents have been hashed, even though the number of possible outputs is far larger.