Editing Condition number

{{Short description|Function's sensitivity to argument change}}
In [[numerical analysis]], the '''condition number''' of a [[function (mathematics)|function]] measures how much the output value of the function can change for a small change in the input argument. This is used to measure how [[sensitivity analysis|sensitive]] a function is to changes or errors in the input, and how much error in the output results from an error in the input. Very frequently, one is solving the inverse problem: given <math>f(x) = y,</math> one is solving for ''x,'' and thus the condition number of the (local) inverse must be used.<ref>{{cite book |first1=David A. |last1=Belsley |first2=Edwin |last2=Kuh |author-link2=Edwin Kuh |first3=Roy E. |last3=Welsch |chapter=The Condition Number |title=Regression Diagnostics: Identifying Influential Data and Sources of Collinearity |location=New York |publisher=John Wiley & Sons |year=1980 |isbn=0-471-05856-4 |pages=100–104 | chapter-url =https://books.google.com/books?id=GECBEUJVNe0C&pg=PA100 }}</ref><ref>{{cite book |first=M. Hashem |last=Pesaran |author-link=M. Hashem Pesaran |chapter =The Multicollinearity Problem |title=Time Series and Panel Data Econometrics |location=New York |publisher=Oxford University Press |year=2015 | isbn=978-0-19-875998-0 |pages=67–72 [p. 70] |chapter-url=https://books.google.com/books?id=7RokCwAAQBAJ&pg=PA70 }}</ref>

The condition number is derived from the theory of [[propagation of uncertainty]], and is formally defined as the value of the [[Asymptotic analysis|asymptotic]] worst-case relative change in output for a relative change in input. The "function" is the solution of a problem and the "arguments" are the data in the problem. The condition number is frequently applied to questions in [[linear algebra]], in which case the derivative is straightforward but the error could be in many different directions, and is thus computed from the geometry of the matrix. More generally, condition numbers can be defined for non-linear functions in several variables.

A problem with a low condition number is said to be '''''well-conditioned''''', while a problem with a high condition number is said to be '''''ill-conditioned'''''. In non-mathematical terms, an ill-conditioned problem is one where, for a small change in the inputs (the [[independent variables]]) there is a large change in the answer or [[dependent variable]]. This means that the correct solution/answer to the equation becomes hard to find. The condition number is a property of the problem.  Paired with the problem are any number of algorithms that can be used to solve the problem, that is, to calculate the solution.  Some algorithms have a property called ''[[Numerical stability|backward stability]]''; in general, a backward stable algorithm can be expected to accurately solve well-conditioned problems. Numerical analysis textbooks give formulas for the condition numbers of problems and identify known backward stable algorithms.

As a rule of thumb, if the condition number <math>\kappa(A) = 10^k</math>, then you may lose up to <math>k</math> digits of accuracy on top of what would be lost to the numerical method due to loss of precision from arithmetic methods.<ref name="Numerical Mathematics and Computing, by Cheney and Kincaid">{{cite book|url=https://books.google.com/books?id=ZUfVZELlrMEC&pg=PA321 |title= Numerical Mathematics and Computing |last1=Cheney |last2=Kincaid |isbn= 978-0-495-11475-8 |date=2008 |page=321 |publisher= Cengage Learning }}</ref> However, the condition number does not give the exact value of the maximum inaccuracy that may occur in the algorithm. It generally just bounds it with an estimate (whose computed value depends on the choice of the norm to measure the inaccuracy).

== Matrices ==<!-- This section is linked from [[Invertible matrix]] -->

For example, the condition number associated with the [[linear equation]]
''Ax''&nbsp;=&nbsp;''b'' gives a bound on how inaccurate the solution ''x'' will be after approximation. Note that this is before the effects of [[round-off error]] are taken into account; conditioning is a property of the [[matrix (mathematics)|matrix]], not the [[algorithm]] or [[floating-point]] accuracy of the computer used to solve the corresponding system. In particular, one should think of the condition number as being (very roughly) the rate at which the solution ''x'' will change with respect to a change in ''b''.  Thus, if the condition number is large, even a small error in ''b'' may cause a large error in ''x''. On the other hand, if the condition number is small, then the error in ''x'' will not be much bigger than the error in ''b''.

The condition number is defined more precisely to be the maximum ratio of the [[relative error]] in ''x'' to the relative error in ''b''.

Let ''e'' be the error in ''b''. Assuming that ''A'' is a [[nonsingular matrix|nonsingular]] matrix, the error in the solution ''A''<sup>−1</sup>''b'' is ''A''<sup>−1</sup>''e''. The ratio of the relative error in the solution to the relative error in ''b'' is

: <math>{\frac{\left\|A^{-1} e\right\|}{\left\|A^{-1} b\right\|}}/{\frac{\|e\|}{\|b\|}} = \frac{\left\|A^{-1} e\right\|}{\|e\|} \frac{\|b\|}{\left\|A^{-1} b\right\|}.</math>

The maximum value (for nonzero ''b'' and ''e'') is then seen to be the product of the two [[operator norm]]s as follows:

:<math>\begin{align}
  \max_{e,b \neq 0} \left\{ \frac{\left\| A^{-1}e \right\|}{\| e \|} \frac{\| b \|}{\left\| A^{-1}b \right\|} \right\}
    &= \max_{e \neq 0} \left\{\frac{\left\| A^{-1}e\right\| }{\| e\|} \right\} \, \max_{b \neq 0} \left\{ \frac {\| b \|}{\left\| A^{-1}b \right\|} \right\} \\
    &= \max_{e \neq 0} \left\{\frac{\left\| A^{-1}e\right\|}{\| e \|}\right\} \, \max_{x \neq 0} \left \{\frac {\| Ax \| }{\| x \|} \right\} \\
    &= \left\| A^{-1} \right \| \, \|A\|.
\end{align}</math>

The same definition is used for any consistent [[matrix norm|norm]], i.e. one that satisfies

: <math>\kappa(A) = \left\| A^{-1} \right\| \, \left\| A \right\| \ge \left\| A^{-1} A \right\| = 1.</math>

When the condition number is exactly one (which can only happen if ''A'' is a scalar multiple of a [[Isometry#Linear isometry|linear isometry]]), then a solution algorithm can find (in principle, meaning if the algorithm introduces no errors of its own) an approximation of the solution whose precision is no worse than that of the data.

However, it does not mean that the algorithm will converge rapidly to this solution, just that it will not diverge arbitrarily because of inaccuracy on the source data (backward error), provided that the forward error introduced by the algorithm does not diverge as well because of accumulating intermediate rounding errors.{{clarify|date=October 2014}}

The condition number may also be infinite, but this implies that the problem is [[well-posed problem|ill-posed]] (does not possess a unique, well-defined solution for each choice of data; that is, the matrix is not [[invertible matrix|invertible]]), and no algorithm can be expected to reliably find a solution.

The definition of the condition number depends on the choice of [[Norm (mathematics)|norm]], as can be illustrated by two examples.

If <math>\|\cdot\|</math> is the [[Matrix norm#Matrix norms induced by vector norms|matrix norm induced by the (vector) Euclidean norm]] (sometimes known as the ''L''<sup>2</sup> norm and typically denoted as <math>\|\cdot\|_2</math>), then

: <math>\kappa(A) = \frac{\sigma_\text{max}(A)}{\sigma_\text{min}(A)},</math>

where <math>\sigma_\text{max}(A)</math> and <math>\sigma_\text{min}(A)</math> are maximal and minimal [[singular value]]s of <math>A</math> respectively. Hence:

* If <math>A</math> is [[normal matrix|normal]], then <math display="block">\kappa(A) = \frac{\max\{\left|\lambda(A)\right|\}}{\min\{\left|\lambda(A)\right|\}},</math> where <math>\lambda_\text{max}(A)</math> and <math>\lambda_\text{min}(A) </math> are maximal and minimal (by moduli) [[eigenvalue]]s of <math>A</math> respectively.
* If <math>A</math> is [[unitary matrix|unitary]], then <math>\kappa(A) = 1.</math>

The condition number with respect to ''L''<sup>2</sup> arises so often in [[numerical linear algebra]] that it is given a name, the '''condition number of a matrix'''.

If <math>\|\cdot\|</math> is the [[Matrix norm#Matrix norms induced by vector norms|matrix norm induced by the  <math>L^\infty</math> (vector) norm]] and <math>A</math> is [[triangular matrix|lower triangular]] non-singular (i.e. <math>a_{ii} \ne 0</math> for all <math>i</math>), then

: <math>\kappa(A) \geq \frac{\max_i\big(|a_{ii}|\big)}{\min_i\big(|a_{ii}|\big)}</math>

recalling that the eigenvalues of any triangular matrix are simply the diagonal entries.

The condition number computed with this norm is generally larger than the condition number computed relative to the [[Euclidean norm]], but it can be evaluated more easily (and this is often the only practicably computable condition number, when the problem to solve involves a ''non-linear algebra''{{what?|date=October 2014}}, for example when approximating irrational and [[transcendental function|transcendental]] functions or numbers with numerical methods).

If the condition number is not significantly larger than one, the matrix is [[well-conditioned]], which means that its inverse can be computed with good accuracy. If the condition number is very large, then the matrix is said to be [[ill-conditioned]]. Practically, such a matrix is almost singular, and the computation of its inverse, or solution of a linear system of equations is prone to large numerical errors. 

A matrix that is not invertible is often said to have a condition number equal to infinity.  Alternatively, it can be defined as <math>\kappa(A) = \|A\| \|A^\dagger\|</math>, where <math>A^\dagger</math> is the Moore-Penrose [[pseudoinverse]].  For square matrices, this unfortunately makes the condition number discontinuous, but it is a useful definition for rectangular matrices, which are never invertible but are still used to define systems of equations.

== Nonlinear ==
Condition numbers can also be defined for nonlinear functions, and can be computed using [[calculus]]. The condition number varies with the point; in some cases one can use the maximum (or [[supremum]]) condition number over the [[domain of a function|domain]] of the function or domain of the question as an overall condition number, while in other cases the condition number at a particular point is of more interest.

=== One variable ===

The ''absolute'' condition number of a [[differentiable function]] <math>f</math> in one variable is the [[absolute value]] of the [[derivative]] of the function:

: <math>\left|f'(x)\right|</math>

The ''relative'' condition number of <math>f</math> as a function is <math>\left|xf'/f\right|</math>. Evaluated at a point <math>x</math>, this is

: <math>\left|\frac{xf'(x)}{f(x)}\right|=\left|\frac{(\log f)'}{(\log x)'}\right|.</math>

Note that this is the absolute value of the [[Elasticity (economics)|elasticity]] of a function in economics.

Most elegantly, this can be understood as (the absolute value of) the ratio of the [[logarithmic derivative]] of <math>f</math>, which is <math>(\log f)' = f'/f</math>, and the logarithmic derivative of <math>x</math>, which is <math>(\log x)' = x'/x = 1/x</math>, yielding a ratio of <math>xf'/f</math>. This is because the logarithmic derivative is the [[Infinitesimal calculus|infinitesimal]] rate of relative change in a function: it is the derivative <math>f'</math> scaled by the value of <math>f</math>. Note that if a function has a [[zero of a function|zero]] at a point, its condition number at the point is infinite, as infinitesimal changes in the input can change the output from zero to positive or negative, yielding a ratio with zero in the denominator, hence infinite relative change.

More directly, given a small change <math>\Delta x</math> in <math>x</math>, the relative change in <math>x</math> is <math>[(x + \Delta x) - x] / x = (\Delta x) / x</math>, while the relative change in <math>f(x)</math> is <math>[f(x + \Delta x) - f(x)] / f(x)</math>. Taking the ratio yields

: <math>\frac{[f(x + \Delta x) - f(x)] / f(x)}{(\Delta x) / x} = \frac{x}{f(x)} \frac{f(x + \Delta x) - f(x)}{(x + \Delta x) - x} = \frac{x}{f(x)} \frac{f(x + \Delta x) - f(x)}{\Delta x}.</math>

The last term is the [[difference quotient]] (the slope of the [[secant line]]), and taking the [[limit (mathematics)|limit]] yields the derivative.

Condition numbers of common [[elementary function]]s are particularly important in computing [[significant figures]] and can be computed immediately from the derivative. A few important ones are given below:
{| class="wikitable"
! Name || Symbol || Relative condition number
|-
| Addition / subtraction || <math>x + a</math> || <math>\left|\frac{x}{x+a}\right|</math>
|-
| Scalar multiplication || <math>a x</math> || <math>1</math>
|-
| Division || <math>1 / x</math> || <math>1</math>
|-
| [[Polynomial]] || <math>x^n</math> || <math>|n|</math>
|-
| [[Exponential function]] || <math>e^x</math> || <math>|x|</math>
|-
| [[Natural logarithm]] function || <math>\ln(x)</math> || <math>\left|\frac{1}{\ln(x)}\right|</math>
|-
| Sine function || <math>\sin(x)</math> || <math>|x\cot(x)|</math>
|-
| Cosine function || <math>\cos(x)</math> || <math>|x\tan(x)|</math>
|-
| Tangent function || <math>\tan(x)</math> || <math>|x(\tan(x)+\cot(x))|</math>
|-
| Inverse sine function || <math>\arcsin(x)</math> || <math>\frac{x}{\sqrt{1-x^2}\arcsin(x)}</math>
|-
| Inverse cosine function || <math>\arccos(x)</math> || <math>\frac{|x|}{\sqrt{1-x^2}\arccos(x)}</math>
|-
| Inverse tangent function || <math>\arctan(x)</math> || <math>\frac{x}{(1+x^2)\arctan(x)}</math>
|}

=== Several variables ===

Condition numbers can be defined for any function <math>f</math> mapping its data from some [[domain of a function|domain]] (e.g. an <math>m</math>-tuple of [[real number]]s <math>x</math>) into some [[codomain]] (e.g. an <math>n</math>-tuple of real numbers <math>f(x)</math>), where both the domain and codomain are [[Banach space]]s. They express how sensitive that function is to small changes (or small errors) in its arguments. This is crucial in assessing the sensitivity and potential accuracy difficulties of numerous computational problems, for example, [[Root-finding algorithms#Roots of polynomials|polynomial root finding]] or computing [[eigenvalue]]s.

The condition number of <math>f</math> at a point <math>x</math> (specifically, its '''relative condition number'''<ref name=TrefethenBau />) is then defined to be the maximum ratio of the fractional change in <math>f(x)</math> to any fractional change in <math>x</math>, in the limit where the change <math>\delta x</math> in <math>x</math> becomes infinitesimally small:<ref name=TrefethenBau>{{cite book |isbn= 978-0-89871-361-9 |first1=L. N. |last1= Trefethen |first2= D. |last2=Bau |title=Numerical Linear Algebra |publisher=SIAM |year=1997 |url=https://books.google.com/books?id=JaPtxOytY7kC&q=978-0898713619}}</ref>

: <math>\lim_{\varepsilon \to 0^+} \sup_{\|\delta x\| \leq \varepsilon} \left[ \left. \frac{\left\|f(x + \delta x) - f(x)\right\|}{\|f(x)\|} \right/ \frac{\|\delta x\|}{\|x\|} \right],</math>

where <math>\|\cdot\|</math> is a [[Norm (mathematics)|norm]] on the domain/codomain of <math>f</math>.

If <math>f</math> is differentiable, this is equivalent to:<ref name=TrefethenBau />

: <math>\frac{\|J(x)\|}{\|f(x) \| / \|x\|},</math>

where {{tmath|J(x)}} denotes the [[Jacobian matrix]] of [[partial derivative]]s of <math>f</math> at <math>x</math>, and <math>\|J(x)\|</math> is the [[induced norm]] on the matrix.

== See also ==
* [[Numerical methods for linear least squares]]
* [[Numerical stability]]
* [[Hilbert matrix]]
* [[Ill-posed problem]]
* [[Singular value]]
* [[Wilson matrix]]

== References ==
{{Reflist}}

== Further reading ==
* {{cite book |first=James |last=Demmel |author-link=James Demmel |chapter=Nearest Defective Matrices and the Geometry of Ill-conditioning |pages=35–55 |title=Reliable Numerical Computation |editor-first=M. G. |editor-last=Cox |editor2-first=S. |editor2-last=Hammarling |location=Oxford |publisher=Clarendon Press |year=1990 |isbn=0-19-853564-3 }}

== External links ==
* [https://web.archive.org/web/20070121001740/http://numericalmethods.eng.usf.edu/mws/gen/04sle/mws_gen_sle_spe_adequacy.pdf Condition Number of a Matrix] at ''Holistic Numerical Methods Institute''
* [http://www.mathworks.in/help/techdoc/ref/cond.html MATLAB library function to determine condition number]
* [https://www.encyclopediaofmath.org/index.php/Condition_number Condition number – Encyclopedia of Mathematics]
* [https://nhigham.com/2019/01/23/who-invented-the-matrix-condition-number/ Who Invented the Matrix Condition Number? by Nick Higham]

[[Category:Numerical analysis]]
[[Category:Matrices (mathematics)]]