Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Hypergeometric distribution
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
===Working example=== The classical application of the hypergeometric distribution is '''sampling without replacement'''. Think of an [[urn problem|urn]] with two colors of [[marbles]], red and green. Define drawing a green marble as a success and drawing a red marble as a failure. Let ''N'' describe the number of '''all marbles in the urn''' (see contingency table below) and ''K'' describe the number of '''green marbles''', then ''N'' β ''K'' corresponds to the number of '''red marbles'''. Now, standing next to the urn, you close your eyes and draw n marbles without replacement. Define ''X'' as a [[random variable]] whose outcome is ''k'', the number of green marbles drawn in the experiment. This situation is illustrated by the following [[contingency table]]: <!-- Formatting problem: tables overlap in Firefox with low resolution unless aligned by right. Please keep align=right! {| class="wikitable" style="float:right; margin-left:1em;" |- ! ! drawn ! not drawn ! total |- | align="right" | '''defective''' | align="right" | ''k'' | align="right" | ''K'' β ''k'' | align="right" | ''K'' |- | align="right" | '''non-defective''' | align="right" | ''n'' β ''k'' | align="right" | ''N β K β n + k'' | align="right" | ''N β K'' |- | align="right" | '''total''' td align="right">''n'' | align="right" | ''N β n'' | align="right" | ''N'' |} {{Clearright}}--> {| class="wikitable" style="text-align:center" ! || drawn || not drawn || total |- | align="right" | '''green marbles''' || ''k'' || ''K'' β ''k'' || ''K'' |- | align="right" | '''red marbles''' || ''n'' β ''k'' || ''N + k β n β K'' || ''N β K'' |- | align="right" | '''total''' || ''n'' || ''N β n'' || ''N'' |- |} Indeed, we are interested in calculating the probability of drawing k green marbles in n draws, given that there are K green marbles out of a total of N marbles. For this example, assume that there are '''5''' green and '''45''' red marbles in the urn. Standing next to the urn, you close your eyes and draw '''10''' marbles without replacement. What is the probability that exactly '''4''' of the '''10''' are green? This problem is summarized by the following contingency table: {| class="wikitable" style="text-align:center" |- ! !! drawn !! not drawn !! total |- | align="right" | '''green marbles''' | ''k'' = '''4''' | ''K'' β ''k'' = '''1''' | ''K'' = '''5''' |- | align="right" | '''red marbles''' | ''n'' β ''k'' = '''6''' | ''N + k β n β K'' = '''39''' | ''N β K'' = '''45''' |- | align="right" | '''total''' | ''n'' = '''10''' | ''N β n'' = '''40''' | ''N'' = '''50''' |} To find the probability of '''drawing k green marbles in exactly n draws out of N total draws''', we identify X as a hyper-geometric random variable to use the formula <math> P(X=k) = f(k;N,K,n) = {{{K \choose k} {{N-K} \choose {n-k}}}\over {N \choose n}}.</math> To intuitively explain the given formula, consider the two symmetric problems represented by the identity <math> {{K \choose k} {N-K \choose n-k}\over {N \choose n}} = {{{n \choose k} {{N-n} \choose {K-k}}} \over {N \choose K}}</math> # left-hand side - drawing a total of only n marbles out of the urn. We want to find the probability of the outcome of '''drawing k green marbles out of K total green marbles, and drawing n-k red marbles out of N-K red marbles, in these n rounds.''' # right hand side - alternatively, drawing all N marbles out of the urn. We want to find the probability of the outcome of drawing '''k green marbles in n draws out of the total N draws, and K-k green marbles in the rest N-n draws.''' Back to the calculations, we use the formula above to calculate the probability of drawing exactly ''k'' green marbles :<math> P(X=4) = f(4;50,5,10) = {{{5 \choose 4} {{45} \choose {6}}}\over {50 \choose 10}} = {5\cdot 8145060\over 10272278170} = 0.003964583\dots. </math> Intuitively we would expect it to be even more unlikely that all 5 green marbles will be among the 10 drawn. :<math> P(X=5) = f(5;50,5,10) = {{{5 \choose 5} {{45} \choose {5}}}\over {50 \choose 10}} = {1\cdot 1221759 \over 10272278170} = 0.0001189375\dots, </math> As expected, the probability of drawing 5 green marbles is roughly 35 times less likely than that of drawing 4.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)