Maximum likelihood estimation
== Non-independent variables ==
It may be the case that variables are correlated, or more generally, not independent. Two random variables <math>y_1</math> and <math>y_2</math> are independent if and only if their joint probability density function is the product of the individual probability density functions, i.e.
<math display="block">f(y_1,y_2) = f(y_1) f(y_2)\,</math>

Suppose one constructs an order-''n'' Gaussian vector out of random variables <math>(y_1,\ldots,y_n)</math>, where the variables have means <math>(\mu_1, \ldots, \mu_n)</math>. Furthermore, let the [[covariance matrix]] be denoted by <math>\mathit\Sigma</math>. These ''n'' random variables then follow a [[multivariate normal distribution]], whose joint probability density function is given by:
<math display="block">f(y_1,\ldots,y_n) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(\mathit\Sigma)}} \exp\left( -\frac{1}{2} \left[y_1-\mu_1,\ldots,y_n-\mu_n\right]\mathit\Sigma^{-1} \left[y_1-\mu_1,\ldots,y_n-\mu_n\right]^\mathrm{T} \right)</math>

In the [[Bivariate analysis|bivariate]] case, with standard deviations <math>\sigma_1</math> and <math>\sigma_2</math> and correlation coefficient <math>\rho</math>, the joint probability density function is given by:
<math display="block"> f(y_1,y_2) = \frac{1}{2\pi \sigma_{1} \sigma_2 \sqrt{1-\rho^2}} \exp\left[ -\frac{1}{2(1-\rho^2)} \left(\frac{(y_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(y_1-\mu_1)(y_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(y_2-\mu_2)^2}{\sigma_2^2}\right) \right] </math>

In this and other cases where a joint density function exists, the likelihood function is defined as above, in the section "[[Maximum likelihood#Principles|principles]]," using this density.

=== Example ===
Let <math>X_1,\ X_2,\ldots,\ X_m</math> be the counts in cells (boxes) 1 through ''m''. Each box has a different probability (think of the boxes being bigger or smaller), and the total number of balls that fall is fixed at <math>n</math>: <math>x_1+x_2+\cdots+x_m=n</math>. The probability of a ball falling into box ''i'' is <math>p_i</math>, subject to the constraint <math>p_1+p_2+\cdots+p_m=1</math>.
This is a case in which the <math>X_i</math> are not independent. The joint probability of a vector <math>x_1,\ x_2,\ldots,x_m</math> is called the multinomial distribution and has the form:
<math display="block">f(x_1,x_2,\ldots,x_m\mid p_1,p_2,\ldots,p_m)=\frac{n!}{\prod x_i!}\prod p_i^{x_i}= \binom{n}{x_1,x_2,\ldots,x_m} p_1^{x_1} p_2^{x_2} \cdots p_m^{x_m}</math>

Each box taken separately against all the other boxes is a binomial, and the multinomial is an extension thereof. The log-likelihood is:
<math display="block">\ell(p_1,p_2,\ldots,p_m)=\log n!-\sum_{i=1}^m \log x_i!+\sum_{i=1}^m x_i\log p_i</math>

The constraint has to be taken into account, which can be done using Lagrange multipliers:
<math display="block">L(p_1,p_2,\ldots,p_m,\lambda)=\ell(p_1,p_2,\ldots,p_m)+\lambda\left(1-\sum_{i=1}^m p_i\right)</math>

Setting the partial derivatives with respect to each <math>p_i</math> to 0 gives <math>x_i/p_i=\lambda</math>. Summing <math>x_i=\lambda p_i</math> over ''i'' and applying both constraints gives <math>\lambda=n</math>, which yields the most natural estimate:
<math display="block">\hat{p}_i=\frac{x_i}{n}</math>

Maximizing the log-likelihood, with or without constraints, may have no closed-form solution, in which case iterative procedures must be used.
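
The multinomial result above can be checked numerically. The following is a minimal sketch in Python using only the standard library. The counts <code>[3, 5, 2]</code> and the function names <code>multinomial_log_likelihood</code> and <code>multinomial_mle</code> are illustrative assumptions, not part of any particular library. It evaluates the log-likelihood at <math>\hat{p}_i = x_i/n</math> and confirms that moving a small amount of probability from one box to another, which keeps the probabilities summing to 1, lowers the log-likelihood.

```python
import math


def multinomial_log_likelihood(counts, probs):
    """Return log n! - sum(log x_i!) + sum(x_i * log p_i)."""
    n = sum(counts)
    # lgamma(k + 1) equals log(k!) for non-negative integers k.
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Boxes with x_i = 0 contribute nothing; skipping them avoids log(0).
    return log_coeff + sum(
        x * math.log(p) for x, p in zip(counts, probs) if x > 0
    )


def multinomial_mle(counts):
    """Closed-form estimate p_i = x_i / n from the Lagrange conditions."""
    n = sum(counts)
    return [x / n for x in counts]


counts = [3, 5, 2]  # hypothetical observed counts, n = 10
p_hat = multinomial_mle(counts)
best = multinomial_log_likelihood(counts, p_hat)

# Shift a small probability mass eps from box j to box i. The perturbed
# vector still sums to 1, so it is a feasible competitor to p_hat.
eps = 0.01
for i in range(len(counts)):
    for j in range(len(counts)):
        if i != j:
            perturbed = list(p_hat)
            perturbed[i] += eps
            perturbed[j] -= eps
            assert multinomial_log_likelihood(counts, perturbed) < best

print("estimate:", p_hat)
print("log-likelihood at estimate:", best)
```

This check only probes a few nearby points and is not a proof of optimality. When no closed form is available, the same log-likelihood function could instead be passed to an iterative optimizer.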