Birthday problem
===Probability of a shared birthday (collision)===
The birthday problem can be generalized as follows:

:Given {{mvar|n}} random integers drawn from a [[Uniform distribution (discrete)|discrete uniform distribution]] with range {{math|[1,''d'']}}, what is the probability {{math|''p''(''n''; ''d'')}} that at least two numbers are the same? ({{math|''d'' {{=}} 365}} gives the usual birthday problem.)<ref>{{cite conference | title = Birthday Paradox for Multi-collisions| last1 = Suzuki| first1 = K. | last2 = Tonien| first2 = D.|display-authors=et al| date = 2006| publisher = Springer| book-title = Lecture Notes in Computer Science, vol 4296 | location = Berlin | id = Information Security and Cryptology – ICISC 2006| editor = Rhee M.S., Lee B. | doi = 10.1007/11927587_5}}</ref>

The general results can be derived using the same arguments given above.

:<math>\begin{align} p(n;d) &= \begin{cases} 1-\displaystyle\prod_{k=1}^{n-1}\left(1-\frac{k}{d}\right) & n\le d \\ 1 & n > d \end{cases} \\[8px] & \approx 1 - e^{-\frac{n(n-1)}{2d}} \\ & \approx 1 - \left( \frac{d-1}{d} \right)^\frac{n(n-1)}{2} \end{align}</math>

Conversely, if {{math|''n''(''p''; ''d'')}} denotes the number of random integers drawn from {{math|[1,''d'']}} to obtain a probability {{mvar|p}} that at least two numbers are the same, then

:<math>n(p;d)\approx \sqrt{2d \cdot \ln\left(\frac{1}{1-p}\right)}.</math>

The birthday problem in this more generic sense applies to [[hash function]]s: the expected number of {{math|''N''}}-[[bit]] hashes that can be generated before getting a collision is not {{math|2<sup>''N''</sup>}}, but rather only about {{math|2<sup>{{frac|''N''|2}}</sup>}}. This is exploited by [[birthday attack]]s on [[cryptographic hash function]]s and is the reason why a small number of collisions in a [[hash table]] are, for all practical purposes, inevitable.

The theory behind the birthday problem was used by Zoe Schnabel<ref>Z. E. Schnabel (1938) ''The Estimation of the Total Fish Population of a Lake'', [[American Mathematical Monthly]] '''45''', 348–352.</ref> under the name of [[mark and recapture|capture-recapture]] statistics to estimate the size of fish populations in lakes. The birthday problem and its generalizations are also useful tools for modelling coincidences.<ref name="Pollanen">M. Pollanen (2024) ''A Double Birthday Paradox in the Study of Coincidences'', [[Mathematics]] '''12'''(24), 3882. https://doi.org/10.3390/math12243882</ref>

====Probability of a unique collision====
The classic birthday problem allows for more than two people to share a particular birthday or for there to be matches on multiple days. The probability that among {{mvar|n}} people there is exactly one pair of individuals with a matching birthday, given {{mvar|d}} possible days, is<ref name="Pollanen"/>

: <math> p_2(n; d) = \frac{{n \choose 2}}{d-n+1} (1-p(n; d)) </math>

Unlike in the standard birthday problem, this probability reaches a maximum value as {{mvar|n}} increases and then decreases. For example, for {{math|''d'' {{=}} 365}}, the probability of a unique match has a maximum value of 0.3864, occurring when {{math|''n'' {{=}} 28}}.

====Generalization to multiple types of people====
[[File:2d birthday.png|thumb|Plot of the probability of at least one shared birthday between at least one man and one woman]]
The basic problem considers all trials to be of one "type". The birthday problem has been generalized to consider an arbitrary number of types.<ref>[[Michael Christopher Wendl|M. C. Wendl]] (2003) ''[https://dx.doi.org/10.1016/S0167-7152(03)00168-8 Collision Probability Between Sets of Random Variables]'', Statistics and Probability Letters '''64'''(3), 249–254.</ref> In the simplest extension there are two types of people, say {{mvar|m}} men and {{mvar|n}} women, and the problem becomes characterizing the probability of a shared birthday between at least one man and one woman. (Shared birthdays between two men or two women do not count.) The probability of no shared birthdays here is

:<math>p_0 =\frac{1}{d^{m+n}} \sum_{i=1}^m \sum_{j=1}^n S_2(m,i) S_2(n,j) \prod_{k=0}^{i+j-1} (d - k)</math>

where {{math|''d'' {{=}} 365}} and {{math|''S''<sub>2</sub>}} denotes the [[Stirling numbers of the second kind]]. Consequently, the desired probability is {{math|1 − ''p''<sub>0</sub>}}.

This variation of the birthday problem is interesting because there is not a unique solution for the total number of people {{math|''m'' + ''n''}}. For example, the usual 50% probability value is realized for both a 32-member group of 16 men and 16 women and a 49-member group of 43 women and 6 men.
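The formulas in this section can be checked numerically. The following is a minimal Python sketch, not a reference implementation: the function names (<code>p_collision</code>, <code>draws_needed_approx</code>, <code>p_unique_pair</code>, <code>p_cross_match</code>) are illustrative choices, and <code>p_unique_pair</code> is deliberately restricted to {{math|2 ≤ ''n'' ≤ ''d''}}, where the closed form above has a nonzero denominator.

```python
import math
from fractions import Fraction


def p_collision(n, d=365):
    """Exact p(n; d): at least two of n uniform draws from d values coincide."""
    if n > d:
        return 1.0  # pigeonhole principle
    no_match = 1.0
    for k in range(1, n):
        no_match *= 1 - k / d
    return 1 - no_match


def p_collision_approx(n, d=365):
    """Exponential approximation 1 - exp(-n(n-1)/(2d))."""
    return 1 - math.exp(-n * (n - 1) / (2 * d))


def draws_needed_approx(p, d=365):
    """Approximate n(p; d) = sqrt(2 d ln(1/(1-p)))."""
    return math.sqrt(2 * d * math.log(1 / (1 - p)))


def p_unique_pair(n, d=365):
    """p_2(n; d): exactly one matching pair, evaluated for 2 <= n <= d only."""
    if not 2 <= n <= d:
        raise ValueError("this sketch assumes 2 <= n <= d")
    return math.comb(n, 2) / (d - n + 1) * (1 - p_collision(n, d))


def stirling2_row(n):
    """Return [S2(n, 0), ..., S2(n, n)] using S2(n,k) = k*S2(n-1,k) + S2(n-1,k-1)."""
    row = [1]  # S2(0, 0) = 1
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for k in range(1, size + 1):
            prev_k = row[k] if k < len(row) else 0
            new[k] = k * prev_k + row[k - 1]
        row = new
    return row


def p_cross_match(m, n, d=365):
    """1 - p_0: at least one of m men shares a birthday with one of n women."""
    s_m, s_n = stirling2_row(m), stirling2_row(n)
    # falling[t] = d (d-1) ... (d-t+1), i.e. the product of (d - k) for k = 0..t-1
    falling = [1]
    for k in range(m + n):
        falling.append(falling[-1] * (d - k))
    total = sum(
        s_m[i] * s_n[j] * falling[i + j]
        for i in range(1, m + 1)
        for j in range(1, n + 1)
    )
    # Exact integer arithmetic avoids overflow; convert to float only at the end.
    return float(1 - Fraction(total, d ** (m + n)))


if __name__ == "__main__":
    print(p_collision(23), p_collision_approx(23))
    print(draws_needed_approx(0.5))
    print(max(range(2, 366), key=p_unique_pair), p_unique_pair(28))
    print(p_cross_match(16, 16), p_cross_match(6, 43))
```

With {{math|''d'' {{=}} 365}}, the sketch can be used to confirm that the collision probability first exceeds 50% at {{math|''n'' {{=}} 23}}, that the unique-pair probability peaks at {{math|''n'' {{=}} 28}}, and that both the 16/16 and the 6/43 groups give a cross-match probability slightly above 50%.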