===Tail bounds===
Let <math>X \sim \operatorname{Hypergeometric}(N,K,n)</math> and <math>p=K/N</math>. Then for <math>0 < t < K/N</math> we can derive the following bounds:<ref name=":0">{{citation | last = Hoeffding | first = Wassily | journal = [[Journal of the American Statistical Association]] | volume = 58 | number = 301 | pages = 13–30 | title = Probability inequalities for sums of bounded random variables | year = 1963 | doi = 10.2307/2282952 | jstor = 2282952 | url = http://repository.lib.ncsu.edu/bitstream/1840.4/2170/1/ISMS_1962_326.pdf }}.</ref>

: <math>\begin{align}
\Pr[X\le (p - t)n] &\le e^{-n\text{D}(p-t\parallel p)} \le e^{-2t^2n}\\
\Pr[X\ge (p+t)n] &\le e^{-n\text{D}(p+t\parallel p)} \le e^{-2t^2n}\\
\end{align}\!</math>

where

: <math> D(a\parallel b)=a\log\frac{a}{b}+(1-a)\log\frac{1-a}{1-b}</math>

is the [[Kullback–Leibler divergence]], and the second inequality in each line uses the fact that <math>D(a\parallel b) \ge 2(a-b)^2</math>.<ref name="wordpress.com">{{cite web |url=https://ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail/ |title=Another Tail of the Hypergeometric Distribution |date=8 December 2015 |website=wordpress.com |access-date=19 March 2018}}</ref>

'''Note''': To derive these bounds, one starts by observing that <math>X = \sum_{i=1}^n Y_i</math>, where the <math>Y_i</math> are ''dependent'' Bernoulli random variables, each with marginal distribution <math>\operatorname{Bernoulli}(p)</math>. Because most theorems bounding sums of random variables concern ''independent'' sequences, one first constructs a sequence <math>Z_i</math> of ''independent'' random variables with the same marginal distribution and applies the theorems to <math>X' = \sum_{i=1}^{n}Z_i</math>. Hoeffding<ref name=":0" /> then proved that the resulting bounds hold for <math>X</math> as well.

If ''n'' is larger than ''N''/2, it can be useful to apply symmetry to "invert" the bounds, which gives the following:<ref name="wordpress.com" /><ref>{{citation | last = Serfling | first = Robert | journal = [[The Annals of Statistics]] | volume = 2 | issue = 1 | pages = 39–48 | title = Probability inequalities for the sum in sampling without replacement | year = 1974 | doi = 10.1214/aos/1176342611 | doi-access = free }}.</ref>

: <math>\begin{align}
\Pr[X\le (p - t)n] &\le e^{-(N-n)\text{D}(p+\tfrac{tn}{N-n}\parallel p)} \le e^{-2 t^2 n \tfrac{n}{N-n}}\\
\Pr[X\ge (p+t)n] &\le e^{-(N-n)\text{D}(p-\tfrac{tn}{N-n}\parallel p)} \le e^{-2 t^2 n \tfrac{n}{N-n}}\\
\end{align}\!</math>
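The first pair of bounds can be checked numerically. The following is a minimal sketch, not taken from the cited sources: the parameter values <math>N=1000</math>, <math>K=300</math>, <math>n=100</math>, <math>t=0.1</math> and the helper <code>kl()</code> are illustrative assumptions; only SciPy's <code>hypergeom</code> distribution is a real library object.

<syntaxhighlight lang="python">
# Illustrative sketch: compare exact hypergeometric tail probabilities against
# the KL-divergence bound and the weaker exp(-2 t^2 n) bound from the section
# above. Parameter values and the kl() helper are assumptions for this example.
import numpy as np
from scipy.stats import hypergeom

def kl(a, b):
    # Kullback-Leibler divergence D(a || b) between Bernoulli(a) and Bernoulli(b)
    return a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b))

N, K, n = 1000, 300, 100   # population size, successes in population, draws
p = K / N
t = 0.1                    # must satisfy 0 < t < K/N

# SciPy's convention is hypergeom(M, n, N) = (population, successes, draws),
# so the article's (N, K, n) maps to the positional arguments in that order.
rv = hypergeom(N, K, n)

# Lower tail: Pr[X <= (p - t) n] <= exp(-n D(p - t || p)) <= exp(-2 t^2 n)
lower_exact = rv.cdf(np.floor((p - t) * n))
print(lower_exact, np.exp(-n * kl(p - t, p)), np.exp(-2 * t**2 * n))

# Upper tail: Pr[X >= (p + t) n] <= exp(-n D(p + t || p)) <= exp(-2 t^2 n)
upper_exact = rv.sf(np.ceil((p + t) * n) - 1)   # sf(k) = Pr[X > k]
print(upper_exact, np.exp(-n * kl(p + t, p)), np.exp(-2 * t**2 * n))
</syntaxhighlight>

Each printed line should show the exact tail probability below the KL-divergence bound, which in turn lies below the simpler <math>e^{-2t^2 n}</math> bound, consistent with <math>D(a\parallel b) \ge 2(a-b)^2</math>.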