==Definition==
Given a probability distribution <math>Q</math> on <math>\R^N</math>, with mean <math>\vec{\mu} = (\mu_1, \mu_2, \mu_3, \dots , \mu_N)^\mathsf{T}</math> and positive semi-definite [[covariance matrix]] <math>\mathbf{\Sigma}</math>, the Mahalanobis distance of a point <math>\vec{x} = (x_1, x_2, x_3, \dots, x_N )^\mathsf{T}</math> from <math>Q</math> is<ref>{{Cite journal |last1=De Maesschalck |first1=R. |last2=Jouan-Rimbaud |first2=D. |last3=Massart |first3=D. L. |title=The Mahalanobis distance |journal=Chemometrics and Intelligent Laboratory Systems |year=2000 |volume=50 |issue=1 |pages=1–18 |doi=10.1016/s0169-7439(99)00047-7}}</ref>
<math display="block">d_M(\vec{x}, Q) = \sqrt{(\vec{x} - \vec{\mu})^\mathsf{T} \mathbf{\Sigma}^{-1} (\vec{x} - \vec{\mu})}.</math>
Given two points <math>\vec{x}</math> and <math>\vec{y}</math> in <math>\R^N</math>, the Mahalanobis distance between them with respect to <math>Q</math> is
<math display="block">d_M(\vec{x}, \vec{y}; Q) = \sqrt{(\vec{x} - \vec{y})^\mathsf{T} \mathbf{\Sigma}^{-1} (\vec{x} - \vec{y})},</math>
which means that <math>d_M(\vec{x}, Q) = d_M(\vec{x}, \vec{\mu}; Q)</math>. When <math>\mathbf{\Sigma}</math> is positive-definite, its inverse exists and is also positive-definite, so the quantities under the square roots are non-negative and the square roots are always defined. (The case of a singular <math>\mathbf{\Sigma}</math> is discussed below.)

The squared Mahalanobis distance admits useful decompositions that help to explain why a multivariate observation is outlying and that provide a graphical tool for identifying outliers.<ref>{{Cite journal |last=Kim |first=M. G. |year=2000 |title=Multivariate outliers and decompositions of Mahalanobis distance |journal=Communications in Statistics – Theory and Methods |volume=29 |issue=7 |pages=1511–1526 |doi=10.1080/03610920008832559 |s2cid=218567835}}</ref>

By the [[spectral theorem]], <math>\mathbf{\Sigma}</math> can be decomposed as <math>\mathbf{\Sigma} = \mathbf{S}^\mathsf{T} \mathbf{S}</math> for some real <math>N \times N</math> matrix <math>\mathbf{S}</math>. One choice for <math>\mathbf{S}</math> is the symmetric square root of <math>\mathbf{\Sigma}</math>, which is the [[Standard deviation#Standard deviation matrix|standard deviation matrix]].<ref name="Das">{{cite arXiv |eprint=2012.14331 |last1=Das |first1=Abhranil |last2=Geisler |first2=Wilson S. |title=Methods to integrate multinormals and compute classification measures |date=2020 |class=stat.ML}}</ref> This gives the equivalent definition
<math display="block">d_M(\vec{x}, \vec{y}; Q) = \|\mathbf{S}^{-1}(\vec{x} - \vec{y})\|,</math>
where <math>\|\cdot\|</math> is the Euclidean norm. That is, the Mahalanobis distance is the Euclidean distance after a [[whitening transformation]]. The existence of <math>\mathbf{S}</math> is guaranteed by the spectral theorem, but <math>\mathbf{S}</math> is not unique.
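As a numerical illustration, the following minimal sketch (assuming [[NumPy]] and [[SciPy]]; the mean, covariance, and query point are made up for the example) computes the distance both from the direct definition and as the Euclidean norm after whitening with the symmetric square root, and checks that the two agree:

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import sqrtm

# Illustrative distribution Q: mean vector and positive-definite covariance.
mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([1.0, 2.0])  # illustrative query point

# Direct definition: d_M(x, Q) = sqrt((x - mu)^T Sigma^{-1} (x - mu)).
diff = x - mu
d_direct = np.sqrt(diff @ np.linalg.inv(Sigma) @ diff)

# Equivalent definition: Euclidean norm after whitening, with S the
# symmetric square root of Sigma (so Sigma = S^T S).
S = sqrtm(Sigma)
d_whitened = np.linalg.norm(np.linalg.solve(S, diff))

assert np.isclose(d_direct, d_whitened)
</syntaxhighlight>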
Different choices of <math>\mathbf{S}</math> have different theoretical and practical advantages.<ref>{{Cite journal |last1=Kessy |first1=Agnan |last2=Lewin |first2=Alex |last3=Strimmer |first3=Korbinian |date=2018-10-02 |title=Optimal Whitening and Decorrelation |url=https://doi.org/10.1080/00031305.2016.1277159 |journal=The American Statistician |volume=72 |issue=4 |pages=309–314 |doi=10.1080/00031305.2016.1277159 |arxiv=1512.00809 |s2cid=55075085 |issn=0003-1305}}</ref>

In practice, the distribution <math>Q</math> is usually the [[sample distribution]] of a set of [[Independent and identically distributed random variables|IID]] samples from an underlying unknown distribution, so <math>\vec{\mu}</math> is the sample mean and <math>\mathbf{\Sigma}</math> is the sample covariance matrix. When the [[affine span]] of the samples is not the entire <math>\R^N</math>, the covariance matrix is singular rather than positive-definite, and the above definition does not apply. However, the Mahalanobis distance is preserved under any full-rank affine transformation of the affine span of the samples. So when the affine span is not the entire <math>\R^N</math>, the samples can first be orthogonally projected onto <math>\R^n</math>, where <math>n</math> is the dimension of the affine span, and the Mahalanobis distance can then be computed as usual in the projected coordinates.
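A hedged sketch of this projection step, assuming NumPy (the samples, query point, and rank tolerance below are illustrative, not prescribed by the definition):

<syntaxhighlight lang="python">
import numpy as np

# Illustrative samples whose affine span is a 1-dimensional subspace of R^3:
# every sample lies on one line, so the 3x3 sample covariance is singular.
samples = np.array([[0.0, 0.0, 0.0],
                    [1.0, 2.0, 3.0],
                    [2.0, 4.0, 6.0],
                    [3.0, 6.0, 9.0]])

mean = samples.mean(axis=0)
centered = samples - mean

# Orthonormal basis of the affine span via SVD; keep the directions whose
# singular values exceed an (illustrative) tolerance.
_, s, Vt = np.linalg.svd(centered, full_matrices=False)
basis = Vt[s > 1e-10]  # shape (n, N); rows span the affine span

# Orthogonally project the samples and a query point onto the n-dim span.
proj = centered @ basis.T
x = np.array([2.5, 5.0, 7.5])  # illustrative point inside the affine span
x_proj = (x - mean) @ basis.T

# Mahalanobis distance computed as usual in the projected coordinates.
cov = np.atleast_2d(np.cov(proj, rowvar=False))  # n x n, positive-definite
d = np.sqrt(x_proj @ np.linalg.inv(cov) @ x_proj)
print(d)
</syntaxhighlight>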