==Function of random variables and change of variables in the probability density function==
If the probability density function of a random variable (or vector) {{math|''X''}} is given as {{math|''f<sub>X</sub>''(''x'')}}, it is possible (but often not necessary; see below) to calculate the probability density function of some variable {{math|1=''Y'' = ''g''(''X'')}}. This is also called a "change of variable" and is in practice used to generate a random variable of arbitrary shape {{math|1=''f''<sub>''g''(''X'')</sub> = ''f<sub>Y</sub>''}} using a known (for instance, uniform) random number generator.

It is tempting to think that in order to find the expected value {{math|E(''g''(''X''))}}, one must first find the probability density {{math|''f''<sub>''g''(''X'')</sub>}} of the new random variable {{math|1=''Y'' = ''g''(''X'')}}. However, rather than computing
<math display="block">\operatorname E\big(g(X)\big) = \int_{-\infty}^\infty y f_{g(X)}(y)\,dy, </math>
one may find instead
<math display="block">\operatorname E\big(g(X)\big) = \int_{-\infty}^\infty g(x) f_X(x)\,dx.</math>

The values of the two integrals are the same in all cases in which both {{math|''X''}} and {{math|''g''(''X'')}} actually have probability density functions. It is not necessary that {{math|''g''}} be a [[one-to-one function]]. In some cases the latter integral is computed much more easily than the former. See [[Law of the unconscious statistician]].

===Scalar to scalar===
Let <math> g: \Reals \to \Reals</math> be a [[monotonic function]]. Then the resulting density function is<ref>{{cite web |last1=Siegrist |first1=Kyle |title=Transformations of Random Variables |date=5 May 2020 |url=https://stats.libretexts.org/Bookshelves/Probability_Theory/Probability_Mathematical_Statistics_and_Stochastic_Processes_%28Siegrist%29/03%3A_Distributions/3.07%3A_Transformations_of_Random_Variables#The_Change_of_Variables_Formula |publisher=LibreTexts Statistics |access-date=22 December 2023}}</ref>
<math display="block">f_Y(y) = f_X\big(g^{-1}(y)\big) \left| \frac{d}{dy} \big(g^{-1}(y)\big) \right|.</math>
Here {{math|''g''<sup>−1</sup>}} denotes the [[inverse function]].

This follows from the fact that the probability contained in a differential area must be invariant under change of variables. That is,
<math display="block">\left| f_Y(y)\, dy \right| = \left| f_X(x)\, dx \right|,</math>
or
<math display="block">f_Y(y) = \left| \frac{dx}{dy} \right| f_X(x) = \left| \frac{d}{dy} (x) \right| f_X(x) = \left| \frac{d}{dy} \big(g^{-1}(y)\big) \right| f_X\big(g^{-1}(y)\big) = {\left|\left(g^{-1}\right)'(y)\right|} \cdot f_X\big(g^{-1}(y)\big) .</math>

For functions that are not monotonic, the probability density function for {{mvar|y}} is
<math display="block">\sum_{k=1}^{n(y)} \left| \frac{d}{dy} g^{-1}_{k}(y) \right| \cdot f_X\big(g^{-1}_{k}(y)\big),</math>
where {{math|''n''(''y'')}} is the number of solutions in {{mvar|x}} for the equation <math>g(x) = y</math>, and <math>g_k^{-1}(y)</math> are these solutions.
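As a brief numerical illustration of the monotonic case (a sketch, not drawn from the cited sources: it assumes {{math|''X''}} standard normal, {{math|1=''g''(''x'') = ''e''<sup>''x''</sup>}}, and the NumPy/SciPy routines named below), the formula reproduces the standard log-normal density:

<syntaxhighlight lang="python">
# Check of f_Y(y) = f_X(g^{-1}(y)) * |d/dy g^{-1}(y)| for g(x) = exp(x):
# here g^{-1}(y) = ln(y) and d/dy g^{-1}(y) = 1/y, and with X ~ N(0, 1)
# the resulting Y = exp(X) should have the standard log-normal density.
import numpy as np
from scipy.stats import norm, lognorm

y = np.linspace(0.05, 5.0, 200)                 # grid of y > 0
f_Y = norm.pdf(np.log(y)) * np.abs(1.0 / y)     # change-of-variables formula

# Reference density: log-normal with sigma = 1 and mu = 0 (scale = e^0 = 1)
assert np.allclose(f_Y, lognorm.pdf(y, s=1.0, scale=1.0))
</syntaxhighlight>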
===Vector to vector===
Suppose {{math|'''x'''}} is an {{mvar|n}}-dimensional random variable with joint density {{math|''f''}}. If {{math|1='''''y''''' = ''G''('''''x''''')}}, where {{math|''G''}} is a [[bijective]], [[differentiable function]], then {{math|'''''y'''''}} has density {{math|''p''<sub>'''''Y'''''</sub>}}:
<math display="block"> p_{Y}(\mathbf{y}) = f\Bigl(G^{-1}(\mathbf{y})\Bigr) \left| \det\left[\left.\frac{dG^{-1}(\mathbf{z})}{d\mathbf{z}}\right|_{\mathbf{z}=\mathbf{y}}\right] \right|</math>
with the differential regarded as the [[Jacobian matrix and determinant|Jacobian]] of the inverse of {{math|''G''(⋅)}}, evaluated at {{math|'''''y'''''}}.<ref>{{cite book |first1=Jay L. |last1=Devore |first2=Kenneth N. |last2=Berk |title=Modern Mathematical Statistics with Applications |publisher=Cengage |year=2007 |isbn=978-0-534-40473-4 |page=263 |url=https://books.google.com/books?id=3X7Qca6CcfkC&pg=PA263 }}</ref>

For example, in the 2-dimensional case {{math|1='''x''' = (''x''<sub>1</sub>, ''x''<sub>2</sub>)}}, suppose the transform {{math|''G''}} is given as {{math|1=''y''<sub>1</sub> = ''G''<sub>1</sub>(''x''<sub>1</sub>, ''x''<sub>2</sub>)}}, {{math|1=''y''<sub>2</sub> = ''G''<sub>2</sub>(''x''<sub>1</sub>, ''x''<sub>2</sub>)}} with inverses {{math|1=''x''<sub>1</sub> = ''G''<sub>1</sub><sup>−1</sup>(''y''<sub>1</sub>, ''y''<sub>2</sub>)}}, {{math|1=''x''<sub>2</sub> = ''G''<sub>2</sub><sup>−1</sup>(''y''<sub>1</sub>, ''y''<sub>2</sub>)}}. The joint distribution for {{math|1='''y''' = (''y''<sub>1</sub>, ''y''<sub>2</sub>)}} has density<ref>{{Cite book |title=Elementary Probability |last=Stirzaker |first=David |date=2007-01-01 |publisher=Cambridge University Press |isbn=978-0521534284 |oclc=851313783}}</ref>
<math display="block">p_{Y_1, Y_2}(y_1,y_2) = f_{X_1,X_2}\big(G_1^{-1}(y_1,y_2), G_2^{-1}(y_1,y_2)\big) \left\vert \frac{\partial G_1^{-1}}{\partial y_1} \frac{\partial G_2^{-1}}{\partial y_2} - \frac{\partial G_1^{-1}}{\partial y_2} \frac{\partial G_2^{-1}}{\partial y_1} \right\vert.</math>
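The multivariate formula can likewise be checked numerically. The following sketch (an assumed example using NumPy/SciPy, not taken from the references) applies it to a linear map {{math|1='''y''' = ''A'''''x'''}} of a standard normal vector, whose image is known to be Gaussian with covariance {{math|''AA''<sup>T</sup>}}:

<syntaxhighlight lang="python">
# Check of p_Y(y) = f_X(G^{-1}(y)) |det J_{G^{-1}}(y)| for the linear map
# G(x) = A x: here G^{-1}(y) = A^{-1} y and the Jacobian determinant is
# det(A^{-1}) = 1/det(A).  With X ~ N(0, I_2), Y should be N(0, A A^T).
import numpy as np
from scipy.stats import multivariate_normal

A = np.array([[2.0, 1.0],
              [0.5, 1.5]])
A_inv = np.linalg.inv(A)
f_X = multivariate_normal(mean=np.zeros(2), cov=np.eye(2))

y = np.array([0.7, -1.2])                       # an arbitrary test point
p_Y = f_X.pdf(A_inv @ y) * abs(np.linalg.det(A_inv))

# Reference: a linear image of a Gaussian is Gaussian with covariance A A^T
ref = multivariate_normal(mean=np.zeros(2), cov=A @ A.T).pdf(y)
assert np.isclose(p_Y, ref)
</syntaxhighlight>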
===Vector to scalar===
Let <math> V: \R^n \to \R </math> be a differentiable function and <math> X </math> be a random vector taking values in <math> \R^n </math>, let <math> f_X </math> be the probability density function of <math> X </math>, and let <math> \delta(\cdot) </math> be the [[Dirac delta]] function. It is possible to use the formulas above to determine <math> f_Y </math>, the probability density function of <math> Y = V(X) </math>, which will be given by
<math display="block">f_Y(y) = \int_{\R^n} f_{X}(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big) \,d \mathbf{x}.</math>

This result leads to the [[law of the unconscious statistician]]:
<math display="block">\begin{align} \operatorname{E}_Y[Y] &=\int_{\R} y f_Y(y) \, dy \\ &= \int_{\R} y \int_{\R^n} f_X(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big) \,d \mathbf{x} \,dy \\ &= \int_{{\mathbb R}^n} \int_{\mathbb R} y f_{X}(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big) \, dy \, d \mathbf{x} \\ &= \int_{\mathbb R^n} V(\mathbf{x}) f_X(\mathbf{x}) \, d \mathbf{x}=\operatorname{E}_X[V(X)]. \end{align}</math>

''Proof:'' Let <math>Z</math> be a collapsed random variable with probability density function <math>p_Z(z) = \delta(z)</math> (i.e., <math>Z</math> is equal to zero with probability one). Let the random vector <math>\tilde{X}</math> and the transform <math>H</math> be defined as
<math display="block">H(Z,X)=\begin{bmatrix} Z+V(X)\\ X\end{bmatrix}=\begin{bmatrix} Y\\ \tilde{X}\end{bmatrix}.</math>
It is clear that <math>H</math> is a bijective mapping, and the Jacobian of <math>H^{-1}</math> is given by
<math display="block">\frac{dH^{-1}(y,\tilde{\mathbf{x}})}{dy\,d\tilde{\mathbf{x}}}=\begin{bmatrix} 1 & -\frac{dV(\tilde{\mathbf{x}})}{d\tilde{\mathbf{x}}}\\ \mathbf{0}_{n\times1} & \mathbf{I}_{n\times n} \end{bmatrix},</math>
which is an upper triangular matrix with ones on the main diagonal, so its determinant is 1. Applying the change of variable theorem from the previous section we obtain
<math display="block">f_{Y,X}(y,\mathbf{x}) = f_X(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big),</math>
which, when marginalized over <math>\mathbf{x}</math>, leads to the desired probability density function.
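As a closing illustration of the last identity, {{math|E<sub>''X''</sub>[''V''(''X'')]}} can be estimated by Monte Carlo directly under {{math|''f<sub>X</sub>''}}, without ever constructing {{math|''f<sub>Y</sub>''}}. The sketch below uses assumed choices of {{math|''V''}} and {{math|''f<sub>X</sub>''}} (not from the cited sources) and NumPy:

<syntaxhighlight lang="python">
# Law of the unconscious statistician, checked by sampling: with
# X ~ N(0, I_2) and V(x) = x_1^2 + x_2^2, Y = V(X) is chi-squared with
# 2 degrees of freedom, so E_Y[Y] = 2.  Averaging V over samples of X
# estimates E_X[V(X)] without computing f_Y at all.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1_000_000, 2))   # samples drawn from f_X
v = (x ** 2).sum(axis=1)                  # V(X) evaluated on each sample

print(v.mean())   # approximately 2.0, the mean of the chi-squared(2) law
</syntaxhighlight>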