=== Example 4 – Entropy ===
Suppose we wish to find the [[Probability distribution#Discrete probability distribution|discrete probability distribution]] on the points <math>\ \{x_1, x_2, \ldots, x_n\}\ </math> with maximal [[information entropy]]. This is the same as saying that we wish to find the [[Principle of maximum entropy|least structured]] probability distribution on these points. In other words, we wish to maximize the [[Shannon entropy]] equation:
<math display="block"> f(p_1,p_2,\ldots,p_n) = -\sum_{j=1}^n p_j \log_2 p_j ~.</math>

For this to be a probability distribution, the sum of the probabilities <math>\ p_i\ </math> at each point <math>\ x_i\ </math> must equal 1, so our constraint is:
<math display="block"> g(p_1,p_2,\ldots,p_n) = \sum_{j=1}^n p_j = 1 ~.</math>

We use Lagrange multipliers to find the point of maximum entropy, <math>\ \vec{p}^{\,*}\ ,</math> across all discrete probability distributions <math>\ \vec{p}\ </math> on <math>\ \{x_1, x_2, \ldots, x_n\} ~.</math> We require that:
<math display="block">\left. \frac{\partial}{\partial \vec{p}}\bigl(f + \lambda (g-1)\bigr) \right|_{\vec{p}=\vec{p}^{\,*}} = 0\ ,</math>
which gives a system of {{mvar|n}} equations, <math>\ k = 1, \ldots, n\ ,</math> such that:
<math display="block">\left. \frac{\partial}{\partial p_k}\left\{ -\left(\sum_{j=1}^n p_j \log_2 p_j \right) + \lambda \left(\sum_{j=1}^n p_j - 1 \right) \right\} \right|_{ p_k = p_{\star k} } = 0 ~.</math>

Carrying out the differentiation of these {{mvar|n}} equations, we get
<math display="block"> -\left(\frac{1}{\ln 2} + \log_2 p_{\star k} \right) + \lambda = 0 ~.</math>

Solving for <math>\ p_{\star k}\ </math> gives <math>\ p_{\star k} = 2^{\lambda - 1/\ln 2}\ ,</math> which depends on {{mvar|λ}} only; hence all the <math>\ p_{\star k}\ </math> are equal. By using the constraint
<math display="block">\sum_j p_j = 1\ ,</math>
we find
<math display="block"> p_{\star k} = \frac{1}{n} ~.</math>

Hence, the uniform distribution is the distribution with the greatest entropy, among distributions on {{mvar|n}} points.
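This derivation can be checked numerically. Below is a minimal sketch (not part of the article's original text) that maximizes the Shannon entropy subject to the normalization constraint using SciPy's SLSQP solver, which handles the equality constraint via a Lagrange multiplier internally; the choice <code>n = 5</code> and the helper name <code>neg_entropy</code> are illustrative assumptions, and SciPy minimizes, so we minimize the negative entropy.

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize

# Illustrative check: maximize f(p) = -sum_j p_j log2 p_j
# subject to g(p) = sum_j p_j = 1, and confirm the optimum is uniform.
n = 5  # number of points (arbitrary illustrative choice)

def neg_entropy(p):
    # SciPy minimizes, so we minimize the *negative* entropy.
    return np.sum(p * np.log2(p))

# Equality constraint g(p) - 1 = 0; SLSQP attaches a multiplier to it.
constraints = [{"type": "eq", "fun": lambda p: np.sum(p) - 1.0}]
bounds = [(1e-9, 1.0)] * n  # keep each p_k > 0 so log2 is defined

p0 = np.random.dirichlet(np.ones(n))  # random valid starting distribution
result = minimize(neg_entropy, p0, method="SLSQP",
                  bounds=bounds, constraints=constraints)

print(result.x)     # ~ [0.2, 0.2, 0.2, 0.2, 0.2], the uniform distribution
print(-result.fun)  # ~ log2(5), the maximal entropy
</syntaxhighlight>

As a sanity check, the reported maximal entropy should be close to <math>\log_2 n\ ,</math> consistent with <math>\ p_{\star k} = 1/n\ </math> derived above.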