=== Inequalities ===

[[Concentration inequalities]] control the likelihood of a random variable taking on large values. [[Markov's inequality]] is among the best-known and simplest to prove: for a ''nonnegative'' random variable {{mvar|X}} and any positive number {{mvar|a}}, it states that{{sfnm|1a1=Feller|1y=1968|1loc=Section IX.6|2a1=Feller|2y=1971|2loc=Section V.7|3a1=Papoulis|3a2=Pillai|3y=2002|3loc=Section 5-4|4a1=Ross|4y=2019|4loc=Section 2.8}} <math display="block"> \operatorname{P}(X\geq a)\leq\frac{\operatorname{E}[X]}{a}. </math> If {{mvar|X}} is any random variable with finite expectation, then Markov's inequality may be applied to the random variable {{math|{{!}}''X''−E[''X'']{{!}}<sup>2</sup>}} to obtain [[Chebyshev's inequality]] <math display="block"> \operatorname{P}(|X-\operatorname{E}[X]|\geq a)\leq\frac{\operatorname{Var}[X]}{a^2}, </math> where {{math|Var}} is the [[variance]]; the short derivation is written out at the end of this section.{{sfnm|1a1=Feller|1y=1968|1loc=Section IX.6|2a1=Feller|2y=1971|2loc=Section V.7|3a1=Papoulis|3a2=Pillai|3y=2002|3loc=Section 5-4|4a1=Ross|4y=2019|4loc=Section 2.8}}

These inequalities are significant for their nearly complete lack of assumptions. For example, for any random variable with finite expectation, the Chebyshev inequality implies that there is at least a 75% probability of an outcome being within two [[standard deviation]]s of the expected value. However, in special cases the Markov and Chebyshev inequalities often give much weaker information than is otherwise available. For example, in the case of an unweighted die, Chebyshev's inequality says that the odds of rolling between 1 and 6 are at least 53%; in reality, the odds are of course 100% (the computation behind the 53% figure is also worked out at the end of this section).{{sfnm|1a1=Feller|1y=1968|1loc=Section IX.6}} The [[Kolmogorov inequality]] extends the Chebyshev inequality to the context of sums of random variables.{{sfnm|1a1=Feller|1y=1968|1loc=Section IX.7}}

The following three inequalities are of fundamental importance in the field of [[mathematical analysis]] and its applications to probability theory.
* [[Jensen's inequality]]: Let {{math|''f'': '''R''' → '''R'''}} be a [[convex function]] and {{mvar|X}} a random variable with finite expectation. Then{{sfnm|1a1=Feller|1y=1971|1loc=Section V.8}} <math display="block"> f(\operatorname{E}(X)) \leq \operatorname{E} (f(X)). </math> Part of the assertion is that the [[positive and negative parts|negative part]] of {{math|''f''(''X'')}} has finite expectation, so that the right-hand side is well-defined (possibly infinite). Convexity of {{mvar|f}} can be phrased as saying that the output of the weighted average of ''two'' inputs underestimates the same weighted average of the two outputs; Jensen's inequality extends this to the setting of completely general weighted averages, as represented by the expectation. In the special case that {{math|1=''f''(''x'') = {{abs|''x''}}<sup>''t''/''s''</sup>}} for positive numbers {{math|''s'' < ''t''}}, one obtains the Lyapunov inequality{{sfnm|1a1=Billingsley|1y=1995|1pp=81,277}} <math display="block"> \left(\operatorname{E}|X|^s\right)^{1/s} \leq \left(\operatorname{E}|X|^t\right)^{1/t}. </math> A derivation is sketched after this list; it can also be proved by the Hölder inequality.{{sfnm|1a1=Feller|1y=1971|1loc=Section V.8}} In measure theory, this is particularly notable for proving the inclusion {{math|L<sup>''t''</sup> ⊂ L<sup>''s''</sup>}} of [[Lp space|{{math|L<sup>''p''</sup> spaces}}]], in the special case of [[probability space]]s.
* [[Hölder's inequality]]: if {{math|''p'' > 1}} and {{math|''q'' > 1}} are numbers satisfying {{math|''p''<sup> −1</sup> + ''q''<sup> −1</sup> {{=}} 1}}, then <math display="block"> \operatorname{E}|XY|\leq(\operatorname{E}|X|^p)^{1/p}(\operatorname{E}|Y|^q)^{1/q} </math> for any random variables {{mvar|X}} and {{mvar|Y}}.{{sfnm|1a1=Feller|1y=1971|1loc=Section V.8}} The special case {{math|''p'' {{=}} ''q'' {{=}} 2}} is called the [[Cauchy–Schwarz inequality]] and is particularly well known; it is written out after this list.{{sfnm|1a1=Feller|1y=1971|1loc=Section V.8}}
* [[Minkowski inequality]]: given any number {{math|''p'' ≥ 1}}, for any random variables {{mvar|X}} and {{mvar|Y}} with {{math|E{{!}}''X''{{!}}<sup>''p''</sup>}} and {{math|E{{!}}''Y''{{!}}<sup>''p''</sup>}} both finite, it follows that {{math|E{{!}}''X'' + ''Y''{{!}}<sup>''p''</sup>}} is also finite and{{sfnm|1a1=Billingsley|1y=1995|1loc=Section 19}} <math display="block"> \Bigl(\operatorname{E}|X+Y|^p\Bigr)^{1/p}\leq\Bigl(\operatorname{E}|X|^p\Bigr)^{1/p}+\Bigl(\operatorname{E}|Y|^p\Bigr)^{1/p}. </math>

The Hölder and Minkowski inequalities can be extended to general [[measure space]]s, and are often given in that context. By contrast, the Jensen inequality is special to the case of probability spaces.
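The derivation of Chebyshev's inequality from Markov's inequality, referenced above, is a standard one-line computation, written out here for concreteness. The event {{math|{{!}}''X'' − E[''X'']{{!}} ≥ ''a''}} coincides with the event {{math|{{!}}''X'' − E[''X'']{{!}}<sup>2</sup> ≥ ''a''<sup>2</sup>}}, and the latter concerns a nonnegative random variable, so Markov's inequality applies: <math display="block"> \operatorname{P}(|X-\operatorname{E}[X]|\geq a) = \operatorname{P}\bigl(|X-\operatorname{E}[X]|^2\geq a^2\bigr) \leq \frac{\operatorname{E}\bigl[|X-\operatorname{E}[X]|^2\bigr]}{a^2} = \frac{\operatorname{Var}[X]}{a^2}. </math>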
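The 53% figure for an unweighted die can likewise be checked directly; the following is a routine calculation included only as an illustration. The outcome {{mvar|X}} is uniformly distributed on the integers 1 through 6, so {{math|E[''X''] {{=}} 7/2}} and <math display="block"> \operatorname{Var}[X]=\operatorname{E}[X^2]-\operatorname{E}[X]^2=\frac{91}{6}-\frac{49}{4}=\frac{35}{12}. </math> Rolling between 1 and 6 means deviating from 7/2 by at most 5/2, so taking {{math|1=''a'' = 5/2}} in Chebyshev's inequality gives <math display="block"> \operatorname{P}(1\leq X\leq 6)\geq 1-\frac{35/12}{(5/2)^2}=1-\frac{7}{15}=\frac{8}{15}\approx 53.3\%. </math>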
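The derivation of the Lyapunov inequality promised in the list above can be sketched as follows. For {{math|0 < ''s'' < ''t''}}, the function {{math|1=''f''(''y'') = {{abs|''y''}}<sup>''t''/''s''</sup>}} is convex since {{math|''t''/''s'' > 1}}, so Jensen's inequality applied to the random variable {{math|{{!}}''X''{{!}}<sup>''s''</sup>}} yields <math display="block"> \bigl(\operatorname{E}[|X|^s]\bigr)^{t/s}\leq\operatorname{E}\bigl[(|X|^s)^{t/s}\bigr]=\operatorname{E}[|X|^t], </math> and taking {{mvar|t}}-th roots of both sides gives the stated form. The inclusion {{math|L<sup>''t''</sup> ⊂ L<sup>''s''</sup>}} follows at once: if {{math|E{{!}}''X''{{!}}<sup>''t''</sup>}} is finite, then the right-hand side is finite, and hence so is {{math|E{{!}}''X''{{!}}<sup>''s''</sup>}}.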
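Finally, the Cauchy–Schwarz inequality mentioned in the list is immediate from substituting {{math|1=''p'' = ''q'' = 2}} into Hölder's inequality: <math display="block"> \operatorname{E}|XY|\leq\sqrt{\operatorname{E}[X^2]}\,\sqrt{\operatorname{E}[Y^2]}. </math>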