Propagation of uncertainty

In statistics, propagation of uncertainty (or propagation of error) is the effect of variables' uncertainties (or errors, more specifically random errors) on the uncertainty of a function based on them. When the variables are the values of experimental measurements they have uncertainties due to measurement limitations (e.g., instrument precision) which propagate due to the combination of variables in the function.

The uncertainty u can be expressed in a number of ways. It may be defined by the absolute error <math>\Delta x</math>. Uncertainties can also be defined by the relative error <math>\Delta x / x</math>, which is usually written as a percentage. Most commonly, the uncertainty on a quantity is quantified in terms of the standard deviation, <math>\sigma</math>, which is the positive square root of the variance. The value of a quantity and its error are then expressed as an interval <math>x \pm u</math>. However, the most general way of characterizing uncertainty is by specifying its probability distribution. If the probability distribution of the variable is known or can be assumed, in theory it is possible to get any of its statistics. In particular, it is possible to derive confidence limits to describe the region within which the true value of the variable may be found. For example, the 68% confidence limits for a one-dimensional variable belonging to a normal distribution are approximately ± one standard deviation <math>\sigma</math> from the central value <math>x</math>, which means that the region <math>x \pm \sigma</math> will cover the true value in roughly 68% of cases.

If the uncertainties are correlated then covariance must be taken into account. Correlation can arise from two different sources. First, the measurement errors may be correlated. Second, when the underlying values are correlated across a population, the uncertainties in the group averages will be correlated.

In a general context where a nonlinear function modifies the uncertain parameters (correlated or not), the standard tools to propagate uncertainty, and to infer the probability distribution or statistics of the resulting quantity, are sampling techniques from the Monte Carlo method family.<ref name="kr11">Template:Cite book</ref> For very large datasets or complex functions, the calculation of the error propagation may be very expensive, so that a surrogate model<ref>Template:Cite journal</ref> or a parallel computing strategy<ref>Template:Cite journal</ref><ref>Template:Cite journal</ref><ref>Template:Cite journal</ref> may be necessary.
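
As a minimal illustration of the sampling approach, the sketch below propagates uncertainty through a nonlinear function by drawing random inputs and summarizing the outputs. The helper name `monte_carlo_propagate`, the independent Gaussian input model, the example function and the sample count are all assumptions made for this example; they are not taken from any particular library.

```python
import math
import random
import statistics

def monte_carlo_propagate(f, means, sigmas, n_samples=100_000, seed=0):
    """Estimate the mean and standard deviation of f(x) by sampling.

    Assumes independent, normally distributed inputs (an illustrative
    choice; correlated inputs would need a joint sampler).
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        sample = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        outputs.append(f(*sample))
    return statistics.fmean(outputs), statistics.stdev(outputs)

# Hypothetical example: f(a, b) = a * exp(b), a = 2.0 +/- 0.1, b = 0.5 +/- 0.05
mc_mean, mc_std = monte_carlo_propagate(
    lambda a, b: a * math.exp(b), means=[2.0, 0.5], sigmas=[0.1, 0.05]
)
print(f"mean = {mc_mean:.4f}, standard deviation = {mc_std:.4f}")
```

The sampled standard deviation can be compared with the result of an algebraic formula to judge whether a linear approximation is adequate for the problem at hand.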

In some particular cases, the uncertainty propagation calculation can be done through simple algebraic procedures. Some of these scenarios are described below.

Linear combinations

Let <math>\{f_k(x_1, x_2, \dots, x_n)\}</math> be a set of m functions, which are linear combinations of <math>n</math> variables <math>x_1, x_2, \dots, x_n</math> with combination coefficients <math>A_{k1}, A_{k2}, \dots,A_{kn}, (k = 1, \dots, m)</math>: <math display="block">f_k = \sum_{i=1}^n A_{ki} x_i,</math> or in matrix notation, <math display="block">\mathbf{f} = \mathbf{A} \mathbf{x}.</math>

Also let the variance–covariance matrix of <math>\mathbf{x}</math> be denoted by <math>\boldsymbol\Sigma^x</math> and let the mean value be denoted by <math>\boldsymbol{\mu}</math>: <math display="block">\begin{align} \boldsymbol\Sigma^x = \operatorname{E}[(\mathbf{x}-\boldsymbol\mu)\otimes (\mathbf{x}-\boldsymbol\mu)] &= \begin{pmatrix}
  \sigma^2_1 & \sigma_{12} & \sigma_{13} & \cdots \\
  \sigma_{21} & \sigma^2_2 & \sigma_{23} & \cdots\\
  \sigma_{31} & \sigma_{32} & \sigma^2_3 & \cdots \\
  \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \\[1ex] &= \begin{pmatrix}
  {\Sigma}^x_{11} & {\Sigma}^x_{12} & {\Sigma}^x_{13} & \cdots \\
  {\Sigma}^x_{21} & {\Sigma}^x_{22} & {\Sigma}^x_{23} & \cdots \\
  {\Sigma}^x_{31} & {\Sigma}^x_{32} & {\Sigma}^x_{33} & \cdots \\
  \vdots & \vdots & \vdots & \ddots
\end{pmatrix}. \end{align}</math> Here <math>\otimes</math> denotes the outer product.

Then, the variance–covariance matrix <math>\boldsymbol\Sigma^f</math> of f is given by <math display="block">\begin{align} \boldsymbol\Sigma^f &= \operatorname{E}\left[(\mathbf{f} - \operatorname{E}[\mathbf{f}]) \otimes (\mathbf{f} - \operatorname{E}[\mathbf{f}])\right] = \operatorname{E}\left[\mathbf{A}(\mathbf{x}-\boldsymbol\mu) \otimes \mathbf{A}(\mathbf{x}-\boldsymbol\mu)\right] \\[1ex] &= \mathbf{A} \operatorname{E}\left[(\mathbf{x}-\boldsymbol\mu) \otimes (\mathbf{x}-\boldsymbol\mu)\right] \mathbf{A}^\mathrm{T} = \mathbf{A} \boldsymbol\Sigma^x \mathbf{A}^\mathrm{T}. \end{align}</math>

In component notation, the equation <math display="block">\boldsymbol\Sigma^f = \mathbf{A} \boldsymbol\Sigma^x \mathbf{A}^\mathrm{T}</math> reads <math display="block">\Sigma^f_{ij} = \sum_k^n \sum_l^n A_{ik} {\Sigma}^x_{kl} A_{jl}.</math>

This is the most general expression for the propagation of error from one set of variables onto another. When the errors on x are uncorrelated, the general expression simplifies to <math display="block">\Sigma^f_{ij} = \sum_k^n A_{ik} \Sigma^x_k A_{jk},</math> where <math>\Sigma^x_k = \sigma^2_{x_k}</math> is the variance of the k-th element of the x vector. Note that even though the errors on x may be uncorrelated, the errors on f are in general correlated; in other words, even if <math>\boldsymbol\Sigma^x</math> is a diagonal matrix, <math>\boldsymbol\Sigma^f</math> is in general a full matrix.
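
The matrix expression can be checked numerically on a small case. The following sketch uses plain Python lists; the helper names (`matmul`, `transpose`, `propagate_linear`) and the input values are hypothetical and chosen only to show that uncorrelated inputs can produce correlated outputs.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def propagate_linear(A, sigma_x):
    """Return Sigma^f = A Sigma^x A^T for the linear map f = A x."""
    return matmul(matmul(A, sigma_x), transpose(A))

# f_1 = x_1 + x_2 and f_2 = x_1 - x_2, with uncorrelated inputs
A = [[1.0, 1.0],
     [1.0, -1.0]]
sigma_x = [[0.04, 0.0],   # variance of x_1 (standard deviation 0.2)
           [0.0, 0.09]]   # variance of x_2 (standard deviation 0.3)

sigma_f = propagate_linear(A, sigma_x)
for row in sigma_f:
    print(row)
```

Here the off-diagonal entries of `sigma_f` equal the difference of the two input variances, so they vanish only when both inputs have the same variance.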

The general expressions for a scalar-valued function f are a little simpler (here a is a row vector): <math display="block">f = \sum_i^n a_i x_i = \mathbf{a x},</math> <math display="block">\sigma^2_f = \sum_i^n \sum_j^n a_i \Sigma^x_{ij} a_j = \mathbf{a} \boldsymbol\Sigma^x \mathbf{a}^\mathrm{T}.</math>

Each covariance term <math>\sigma_{ij}</math> can be expressed in terms of the correlation coefficient <math>\rho_{ij}</math> by <math>\sigma_{ij} = \rho_{ij} \sigma_i \sigma_j</math>, so that an alternative expression for the variance of f is <math display="block">\sigma^2_f = \sum_i^n a_i^2 \sigma^2_i + \sum_i^n \sum_{j (j \ne i)}^n a_i a_j \rho_{ij} \sigma_i \sigma_j.</math>

In the case that the variables in x are uncorrelated, this simplifies further to <math display="block">\sigma^2_f = \sum_i^n a_i^2 \sigma^2_i.</math>

In the simple case of identical coefficients and variances, we find <math display="block">\sigma_f = \sqrt{n}\, |a| \sigma.</math>

For the arithmetic mean, <math>a=1/n</math>, the result is the standard error of the mean: <math display="block">\sigma_f = \frac{\sigma} {\sqrt{n}}.</math>
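
A short numerical sketch of the scalar formulas follows. The function name `linear_combination_sigma` and the values of n and σ are illustrative assumptions; the optional correlation matrix is included only to contrast the uncorrelated case with a fully correlated one.

```python
import math

def linear_combination_sigma(a, sigma, rho=None):
    """Standard deviation of f = sum_i a_i x_i.

    rho is an optional correlation matrix (list of rows); when it is
    omitted the variables are treated as uncorrelated.
    """
    n = len(a)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                r = 1.0
            elif rho is None:
                r = 0.0
            else:
                r = rho[i][j]
            variance += a[i] * a[j] * r * sigma[i] * sigma[j]
    return math.sqrt(variance)

n, sigma = 25, 0.5                      # hypothetical sample size and spread
weights = [1.0 / n] * n                 # arithmetic mean: a = 1/n
sem = linear_combination_sigma(weights, [sigma] * n)

# Same mean, but with every pair of variables perfectly correlated
ones = [[1.0] * n for _ in range(n)]
sem_correlated = linear_combination_sigma(weights, [sigma] * n, rho=ones)
print(sem, sem_correlated)
```

With uncorrelated variables the result matches σ/√n, whereas with perfect positive correlation averaging does not reduce the uncertainty at all.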

Non-linear combinations

When f is a set of non-linear combinations of the variables x, an interval propagation could be performed in order to compute intervals which contain all consistent values for the variables. In a probabilistic approach, the function f must usually be linearised by approximation to a first-order Taylor series expansion, though in some cases exact formulae can be derived that do not depend on the expansion, as is the case for the exact variance of products.<ref name="Goodman1960">Template:Cite journal</ref> The Taylor expansion would be: <math display="block">f_k \approx f^0_k+ \sum_i^n \frac{\partial f_k}{\partial {x_i}} x_i </math> where <math>\partial f_k/\partial x_i</math> denotes the partial derivative of <math>f_k</math> with respect to the i-th variable, evaluated at the mean value of all components of vector x. Or in matrix notation, <math display="block">\mathrm{f} \approx \mathrm{f}^0 + \mathrm{J} \mathrm{x}\,</math> where J is the Jacobian matrix. Since <math>\mathrm{f}^0</math> is a constant it does not contribute to the error on f. Therefore, the propagation of error follows the linear case, above, but replacing the linear coefficients, <math>A_{ki}</math> and <math>A_{kj}</math>, by the partial derivatives, <math>\frac{\partial f_k}{\partial x_i}</math> and <math>\frac{\partial f_k}{\partial x_j}</math>. In matrix notation,<ref>Ochoa, Benjamin; Belongie, Serge. "Covariance Propagation for Guided Matching".</ref> <math display="block">\mathrm{\Sigma}^\mathrm{f} = \mathrm{J} \mathrm{\Sigma}^\mathrm{x} \mathrm{J}^\top.</math>

That is, the Jacobian of the function is used to transform the rows and columns of the variance-covariance matrix of the argument. Note this is equivalent to the matrix expression for the linear case with <math>\mathrm{J = A}</math>.
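
The Jacobian-based rule can be sketched in a few lines. In the example below the Jacobian is approximated by central finite differences, which is an assumption of this sketch rather than part of the method (an analytic Jacobian works equally well); the function names and the polar-to-Cartesian test case are likewise hypothetical.

```python
import math

def numerical_jacobian(f, x, h=1e-6):
    """Central-difference Jacobian of a vector-valued function f at x."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for k in range(m):
            J[k][i] = (f_plus[k] - f_minus[k]) / (2 * h)
    return J

def propagate_covariance(f, x, sigma_x):
    """First-order covariance of f(x): Sigma^f = J Sigma^x J^T."""
    J = numerical_jacobian(f, x)
    m, n = len(J), len(x)
    sigma_f = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            sigma_f[i][j] = sum(
                J[i][k] * sigma_x[k][l] * J[j][l]
                for k in range(n) for l in range(n)
            )
    return sigma_f

def to_cartesian(p):
    """Map polar coordinates (r, theta) to Cartesian (x, y)."""
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]

# r = 2.0 +/- 0.1 and theta = pi/4 +/- 0.02 rad, assumed uncorrelated
cov_xy = propagate_covariance(
    to_cartesian, [2.0, math.pi / 4], [[0.01, 0.0], [0.0, 0.0004]]
)
for row in cov_xy:
    print(row)
```

Although the inputs are uncorrelated, the resulting Cartesian coordinates have a non-zero covariance, as expected from the full matrix expression.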

Simplification

Neglecting correlations or assuming independent variables yields a common formula among engineers and experimental scientists to calculate error propagation, the variance formula:<ref>Template:Cite journal</ref> <math display="block">s_f = \sqrt{ \left(\frac{\partial f}{\partial x}\right)^2 s_x^2 + \left(\frac{\partial f}{\partial y} \right)^2 s_y^2 + \left(\frac{\partial f}{\partial z} \right)^2 s_z^2 + \cdots}</math> where <math>s_f</math> represents the standard deviation of the function <math>f</math>, <math>s_x</math> represents the standard deviation of <math>x</math>, <math>s_y</math> represents the standard deviation of <math>y</math>, and so forth.

This formula is based on the linear characteristics of the gradient of <math>f</math> and therefore it is a good estimate of the standard deviation of <math>f</math> as long as <math>s_x, s_y, s_z,\ldots</math> are small enough. Specifically, the linear approximation of <math>f</math> has to be close to <math>f</math> inside a neighbourhood of radius <math>s_x, s_y, s_z,\ldots</math>.<ref>Template:Cite book</ref>
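
The "small enough" condition can be made concrete with a case where the exact answer is known. The sketch below assumes f(x) = x² with a normally distributed x, for which the variance of x² is 4μ²σ² + 2σ⁴; the choice of function, the normality assumption and the function names are all specific to this example.

```python
import math

def linearized_sigma(x, s_x):
    """First-order estimate for f(x) = x**2: |df/dx| * s_x."""
    return abs(2 * x) * s_x

def exact_sigma(x, s_x):
    """Exact standard deviation of x**2 for normally distributed x."""
    return math.sqrt(4 * x**2 * s_x**2 + 2 * s_x**4)

for s_x in (0.01, 0.1, 0.5):
    approx = linearized_sigma(1.0, s_x)
    exact = exact_sigma(1.0, s_x)
    print(f"s_x = {s_x}: linearized = {approx:.4f}, exact = {exact:.4f}, "
          f"ratio = {approx / exact:.4f}")
```

The ratio stays close to one while s_x is small compared with x and drifts away as s_x grows, which is the behaviour the variance formula's validity condition describes.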

Example

Any non-linear differentiable function, <math>f(a,b)</math>, of two variables, <math>a</math> and <math>b</math>, can be expanded as <math display="block">f\approx f^0+\frac{\partial f}{\partial a}a+\frac{\partial f}{\partial b}b.</math> If we take the variance on both sides and use the formula for the variance of a linear combination of variables <math display="block">\operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y) + 2ab \operatorname{Cov}(X, Y),</math> then we obtain <math display="block">\sigma^2_f\approx\left| \frac{\partial f}{\partial a}\right| ^2\sigma^2_a+\left| \frac{\partial f}{\partial b}\right|^2\sigma^2_b+2\frac{\partial f}{\partial a}\frac{\partial f} {\partial b}\sigma_{ab},</math> where <math>\sigma_{f}</math> is the standard deviation of the function <math>f</math>, <math>\sigma_{a}</math> is the standard deviation of <math>a</math>, <math>\sigma_{b}</math> is the standard deviation of <math>b</math> and <math>\sigma_{ab} = \sigma_{a}\sigma_{b} \rho_{ab}</math> is the covariance between <math>a</math> and <math>b</math>.

In the particular case that <math>f = ab</math>, we have <math>\frac{\partial f}{\partial a} = b</math> and <math>\frac{\partial f}{\partial b} = a</math>. Then <math display="block">\sigma^2_f \approx b^2\sigma^2_a+a^2 \sigma_b^2+2ab\,\sigma_{ab}</math> or <math display="block">\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a} \right)^2 + \left(\frac{\sigma_b}{b}\right)^2 + 2\left(\frac{\sigma_a}{a}\right)\left(\frac{\sigma_b}{b}\right)\rho_{ab}</math> where <math>\rho_{ab}</math> is the correlation between <math>a</math> and <math>b</math>.

When the variables <math>a</math> and <math>b</math> are uncorrelated, <math>\rho_{ab}=0</math>. Then <math display="block">\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a} \right)^2 + \left(\frac{\sigma_b}{b}\right)^2.</math>
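
The relative-uncertainty form for a product lends itself to a quick numerical check. The function name and the measured values below are hypothetical; the correlation argument defaults to the uncorrelated case.

```python
import math

def product_relative_sigma(a, sigma_a, b, sigma_b, rho_ab=0.0):
    """First-order relative uncertainty sigma_f / |f| for f = a * b."""
    rel_a, rel_b = sigma_a / a, sigma_b / b
    return math.sqrt(rel_a**2 + rel_b**2 + 2 * rel_a * rel_b * rho_ab)

# a = 10.0 +/- 0.3 (3 %) and b = 5.0 +/- 0.2 (4 %)
rel_uncorrelated = product_relative_sigma(10.0, 0.3, 5.0, 0.2)
rel_correlated = product_relative_sigma(10.0, 0.3, 5.0, 0.2, rho_ab=1.0)
rel_anticorrelated = product_relative_sigma(10.0, 0.3, 5.0, 0.2, rho_ab=-1.0)
print(rel_uncorrelated, rel_correlated, rel_anticorrelated)
```

In the uncorrelated case the relative uncertainties add in quadrature (3 % and 4 % give 5 %), while perfect positive or negative correlation makes them add or subtract linearly.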

Caveats and warnings

Error estimates for non-linear functions are biased on account of using a truncated series expansion. The extent of this bias depends on the nature of the function. For example, the bias on the error calculated for log(1+x) increases as x increases, since the expansion to x is a good approximation only when x is near zero.
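
The log(1+x) remark can be illustrated by measuring how far the first-order term x departs from the exact value as x grows. This sketch looks only at the truncation of the series itself, not at a full bias calculation for a propagated error, and the function name is an assumption of the example.

```python
import math

def first_order_relative_error(x):
    """Relative error of approximating log(1 + x) by its first-order term x."""
    exact = math.log1p(x)
    return (x - exact) / exact

errors = {x: first_order_relative_error(x) for x in (0.01, 0.1, 0.5, 1.0)}
for x, err in errors.items():
    print(f"x = {x}: relative error of the linear term = {err:.3f}")
```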

For highly non-linear functions, there exist five categories of probabilistic approaches for uncertainty propagation;<ref>Template:Cite journal</ref> see Uncertainty quantification for details.

Reciprocal and shifted reciprocal

In the special case of the inverse or reciprocal <math>1/B</math>, where <math>B=N(0,1)</math> follows a standard normal distribution, the resulting distribution is a reciprocal standard normal distribution, and there is no definable variance.<ref name=Johnson>Template:Cite book</ref>

However, in the slightly more general case of a shifted reciprocal function <math>1/(p-B)</math>, with <math>B=N(\mu,\sigma)</math> following a general normal distribution, mean and variance statistics do exist in a principal value sense, if the difference between the pole <math>p</math> and the mean <math>\mu</math> is real-valued.<ref name=lecomte2013exact>Template:Cite journal</ref>

Ratios

Ratios are also problematic; normal approximations exist under certain conditions.

Example formulae

This table shows the variances and standard deviations of simple functions of the real variables <math>A, B</math> with standard deviations <math>\sigma_A, \sigma_B,</math> covariance <math>\sigma_{AB} = \rho_{AB} \sigma_A \sigma_B,</math> and correlation <math>\rho_{AB}.</math> The real-valued coefficients <math>a</math> and <math>b</math> are assumed exactly known (deterministic), i.e., <math>\sigma_a = \sigma_b = 0.</math>

In the right-hand columns of the table, <math>A</math> and <math>B</math> are expectation values, and <math>f</math> is the value of the function calculated at those values.

Function	Variance	Standard deviation
<math>f = aA\,</math>	<math>\sigma_f^2 = a^2\sigma_A^2</math>	<math>\sigma_f = |a|\sigma_A</math>
<math>f = A + B</math>	<math>\sigma_f^2 = \sigma_A^2 + \sigma_B^2 + 2\sigma_{AB}</math>	<math>\sigma_f = \sqrt{\sigma_A^2 + \sigma_B^2 + 2\sigma_{AB}}</math>
<math>f = A - B</math>	<math>\sigma_f^2 = \sigma_A^2 + \sigma_B^2 - 2\sigma_{AB}</math>	<math>\sigma_f = \sqrt{\sigma_A^2 + \sigma_B^2 - 2\sigma_{AB}}</math>
<math>f = aA + bB</math>	<math>\sigma_f^2 = a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\sigma_{AB}</math>	<math>\sigma_f = \sqrt{a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\sigma_{AB}}</math>
<math>f = aA - bB</math>	<math>\sigma_f^2 = a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\sigma_{AB}</math>	<math>\sigma_f = \sqrt{a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\sigma_{AB}}</math>
<math>f = AB</math>	<math>\sigma_f^2 \approx f^2 \left[ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_{AB}}{AB} \right]</math>	<math>\sigma_f \approx \left| f \right| \sqrt{ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_{AB}}{AB} }</math>
<math>f = \frac{A}{B}</math>	<math>\sigma_f^2 \approx f^2 \left[ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\sigma_{AB}}{AB} \right]</math>	<math>\sigma_f \approx \left| f \right| \sqrt{ \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\sigma_{AB}}{AB} }</math>
<math>f = \frac{A}{A+B}</math>	<math>\sigma_f^2 \approx \frac{f^2}{\left(A+B\right)^2} \left(\frac{B^2}{A^2}\sigma_A^2 +\sigma_B^2 - 2\frac{B}{A} \sigma_{AB} \right)</math>	<math>\sigma_f \approx \left| \frac{f}{A+B} \right| \sqrt{\frac{B^2}{A^2}\sigma_A^2 +\sigma_B^2 - 2\frac{B}{A} \sigma_{AB} }</math>
<math>f = a A^b</math>	<math>\sigma_f^2 \approx \left( {a}{b}{A}^{b-1}{\sigma_A} \right)^2 = \left( \frac{{f}{b}{\sigma_A}}{A} \right)^2 </math>	<math>\sigma_f \approx \left| {a}{b}{A}^{b-1}{\sigma_A} \right| = \left| \frac{{f}{b}{\sigma_A}}{A} \right|</math>
<math>f = a \ln(bA)</math>	<math>\sigma_f^2 \approx \left(a \frac{\sigma_A}{A} \right)^2</math><ref name=harris2003>Template:Citation</ref>	<math>\sigma_f \approx \left| a \frac{\sigma_A}{A} \right|</math>
<math>f = a \log_{10}(bA)</math>	<math>\sigma_f^2 \approx \left(a \frac{\sigma_A}{A \ln(10)} \right)^2</math><ref name=harris2003/>	<math>\sigma_f \approx \left| a \frac{\sigma_A}{A \ln(10)} \right|</math>
<math>f = a e^{bA}</math>	<math>\sigma_f^2 \approx f^2 \left( b\sigma_A \right)^2</math>	<math>\sigma_f \approx \left| f \right| \left| b\sigma_A \right|</math>
<math>f = a^{bA}</math>	<math>\sigma_f^2 \approx f^2 (b\ln(a)\sigma_A)^2</math>	<math>\sigma_f \approx \left| f \right| \left| b \ln(a) \sigma_A \right|</math>
<math>f = a \sin(bA)</math>	<math>\sigma_f^2 \approx \left[ a b \cos(b A) \sigma_A \right]^2</math>	<math>\sigma_f \approx \left| a b \cos(b A) \sigma_A \right|</math>
<math>f = a \cos \left( b A \right)\,</math>	<math>\sigma_f^2 \approx \left[ a b \sin(b A) \sigma_A \right]^2</math>	<math>\sigma_f \approx \left| a b \sin(b A) \sigma_A \right|</math>
<math>f = a \tan \left( b A \right)\,</math>	<math>\sigma_f^2 \approx \left[ a b \sec^2(b A) \sigma_A \right]^2</math>	<math>\sigma_f \approx \left| a b \sec^2(b A) \sigma_A \right|</math>
<math>f = A^B</math>	<math>\sigma_f^2 \approx f^2 \left[ \left( \frac{B}{A}\sigma_A \right)^2 +\left( \ln(A)\sigma_B \right)^2 + 2 \frac{B \ln(A)}{A} \sigma_{AB} \right]</math>	<math>\sigma_f \approx \left| f \right| \sqrt{ \left( \frac{B}{A}\sigma_A \right)^2 +\left( \ln(A)\sigma_B \right)^2 + 2 \frac{B \ln(A)}{A} \sigma_{AB} }</math>
<math>f = \sqrt{aA^2 \pm bB^2}</math>	<math>\sigma_f^2 \approx \left(\frac{A}{f}\right)^2 a^2\sigma_A^2 + \left(\frac{B}{f}\right)^2 b^2\sigma_B^2 \pm 2ab\frac{AB}{f^2}\,\sigma_{AB}</math>	<math>\sigma_f \approx \sqrt{\left(\frac{A}{f}\right)^2 a^2\sigma_A^2 + \left(\frac{B}{f}\right)^2 b^2\sigma_B^2 \pm 2ab\frac{AB}{f^2}\,\sigma_{AB}}</math>

For uncorrelated variables (<math>\rho_{AB} = 0</math>, <math>\sigma_{AB} = 0</math>) expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives <math display="block">f = ABC; \qquad \left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2+ \left(\frac{\sigma_C}{C}\right)^2.</math>

For the case <math>f = AB </math> we also have Goodman's expression<ref name="Goodman1960"/> for the exact variance: for the uncorrelated case it is <math display="block">\operatorname{V}[XY] = \operatorname{E}[X]^2 \operatorname{V}[Y] + \operatorname{E}[Y]^2 \operatorname{V}[X] + \operatorname{E}\left[\left(X - \operatorname{E}(X)\right)^2 \left(Y - \operatorname{E}(Y)\right)^2\right],</math> and therefore we have <math display="block">\sigma_f^2 = A^2\sigma_B^2 + B^2\sigma_A^2 + \sigma_A^2\sigma_B^2.</math>
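
The difference between the first-order and the exact product variance is easy to see numerically. The sketch below assumes A and B are independent, so that the last expectation in Goodman's expression factorizes into the product of the two variances; the function names and input values are hypothetical.

```python
def product_variance_first_order(A, sigma_A, B, sigma_B):
    """First-order (linearized) variance of f = A * B, no correlation."""
    return A**2 * sigma_B**2 + B**2 * sigma_A**2

def product_variance_exact(A, sigma_A, B, sigma_B):
    """Exact variance of f = A * B for independent A and B."""
    return A**2 * sigma_B**2 + B**2 * sigma_A**2 + sigma_A**2 * sigma_B**2

# Deliberately large uncertainties: A = 2 +/- 1 and B = 3 +/- 1
approx_var = product_variance_first_order(2.0, 1.0, 3.0, 1.0)
exact_var = product_variance_exact(2.0, 1.0, 3.0, 1.0)
print(approx_var, exact_var)
```

The extra term is the product of the two variances, so it becomes negligible when both relative uncertainties are small.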

Effect of correlation on differences

If A and B are uncorrelated, their difference A − B will have more variance than either of them. An increasing positive correlation (<math>\rho_{AB} \to 1</math>) will decrease the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. On the other hand, a negative correlation (<math>\rho_{AB} \to -1</math>) will further increase the variance of the difference, compared to the uncorrelated case.

For example, the self-subtraction f = A − A has zero variance <math>\sigma_f^2 = 0</math> only if the variate is perfectly autocorrelated (<math>\rho_A = 1</math>). If A is uncorrelated, <math>\rho_A = 0,</math> then the output variance is twice the input variance, <math>\sigma_f^2 = 2\sigma^2_A.</math> And if A is perfectly anticorrelated, <math>\rho_A = -1,</math> then the input variance is quadrupled in the output, <math>\sigma_f^2 = 4 \sigma^2_A</math> (notice <math>1 - \rho_A = 2</math> for f = aA − aA in the table above).
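
The three limiting cases can be reproduced directly from the table entry for f = A − B. The function name and the standard deviation used below are illustrative assumptions.

```python
def difference_variance(sigma_A, sigma_B, rho_AB):
    """Variance of f = A - B for a given correlation coefficient."""
    return sigma_A**2 + sigma_B**2 - 2 * rho_AB * sigma_A * sigma_B

sigma = 1.5  # hypothetical common standard deviation of A and B
var_correlated = difference_variance(sigma, sigma, 1.0)
var_uncorrelated = difference_variance(sigma, sigma, 0.0)
var_anticorrelated = difference_variance(sigma, sigma, -1.0)
print(var_correlated, var_uncorrelated, var_anticorrelated)
```

For equal variances the results are zero, twice, and four times the input variance, matching the cases described in the text.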

Example calculations

Inverse tangent function

We can calculate the uncertainty propagation for the inverse tangent function as an example of using partial derivatives to propagate error.

Define <math display="block">f(x) = \arctan(x),</math> where <math>\Delta_x</math> is the absolute uncertainty on our measurement of <math>x</math>. The derivative of <math>f(x)</math> with respect to <math>x</math> is <math display="block">\frac{d f}{d x} = \frac{1}{1+x^2}.</math>

Therefore, our propagated uncertainty is <math display="block">\Delta_{f} \approx \frac{\Delta_x}{1+x^2},</math> where <math>\Delta_f</math> is the absolute propagated uncertainty.
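
As a sanity check, the propagated uncertainty can be compared with half the spread of the function over the interval x ± Δx. The measurement values and the function name in this sketch are hypothetical.

```python
import math

def arctan_uncertainty(x, delta_x):
    """First-order propagated uncertainty of f(x) = arctan(x)."""
    return delta_x / (1 + x**2)

x, delta_x = 1.0, 0.02  # hypothetical measurement x = 1.00 +/- 0.02
delta_f = arctan_uncertainty(x, delta_x)

# Cross-check: half the spread of arctan over [x - delta_x, x + delta_x]
half_spread = (math.atan(x + delta_x) - math.atan(x - delta_x)) / 2
print(delta_f, half_spread)
```

The two numbers agree closely here because Δx is small; for larger Δx the curvature of arctan would make them diverge.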

Resistance measurement

A practical application is an experiment in which one measures current, <math>I</math>, and voltage, <math>V</math>, on a resistor in order to determine the resistance, <math>R</math>, using Ohm's law, <math>R = V/I</math>.

Given the measured variables with uncertainties, <math>I \pm \sigma_I</math> and <math>V \pm \sigma_V</math>, and neglecting their possible correlation, the uncertainty in the computed quantity, <math>\sigma_R</math>, is:

<math display="block">\sigma_R \approx \sqrt{ \sigma_V^2 \left(\frac{1}{I}\right)^2 + \sigma_I^2 \left(\frac{-V}{I^2}\right)^2 } = R\sqrt{ \left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_I}{I}\right)^2 }.</math>
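
A minimal sketch of this calculation is shown below. The function name and the measured values are hypothetical, and the measurements are assumed to be uncorrelated as in the formula above.

```python
import math

def resistance_with_uncertainty(V, sigma_V, I, sigma_I):
    """Return R = V / I and its first-order uncertainty (no correlation)."""
    R = V / I
    sigma_R = math.sqrt((sigma_V / I) ** 2 + (sigma_I * V / I**2) ** 2)
    return R, sigma_R

# V = 12.00 +/- 0.36 V (3 %) and I = 3.00 +/- 0.12 A (4 %)
R, sigma_R = resistance_with_uncertainty(12.0, 0.36, 3.0, 0.12)
print(f"R = {R:.2f} +/- {sigma_R:.2f} ohm")
```

The relative uncertainties of 3 % and 4 % combine in quadrature to 5 % of R, consistent with the right-hand form of the equation.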

External links

A detailed discussion of measurements and the propagation of uncertainty explaining the benefits of using error propagation formulas and Monte Carlo simulations instead of simple significance arithmetic
GUM, Guide to the Expression of Uncertainty in Measurement
EPFL An Introduction to Error Propagation, Derivation, Meaning and Examples of Cy = Fx Cx Fx'
uncertainties package, a program/library for transparently performing calculations with uncertainties (and error correlations).
soerp package, a Python program/library for transparently performing *second-order* calculations with uncertainties (and error correlations).
Uncertainty Calculator Propagate uncertainty for any expression