== Other forms of multivariate location and scatter ==
[[File:Mahalanobis-distance-location-and-scatter-methods.png|thumb|620x620px|Hypothetical two-dimensional example of Mahalanobis distance with three different methods of defining the multivariate location and scatter of the data.]]
The sample mean and covariance matrix can be quite sensitive to outliers, so other approaches for estimating the multivariate location and scatter of data are also commonly used when calculating the Mahalanobis distance. The Minimum Covariance Determinant approach estimates multivariate location and scatter from the subset of <math>h</math> data points whose covariance matrix has the smallest determinant.<ref>{{Cite journal|last1=Hubert|first1=Mia|last2=Debruyne|first2=Michiel|date=2010|title=Minimum covariance determinant|url=https://onlinelibrary.wiley.com/doi/10.1002/wics.61|journal=WIREs Computational Statistics|language=en|volume=2|issue=1|pages=36–43|doi=10.1002/wics.61|s2cid=123086172 |issn=1939-5108|url-access=subscription}}</ref> The Minimum Volume Ellipsoid approach is similar in that it also works with a subset of <math>h</math> data points, but it estimates multivariate location and scatter from the ellipsoid of minimal volume that encloses the <math>h</math> data points.<ref>{{Cite journal|last1=Van Aelst|first1=Stefan|last2=Rousseeuw|first2=Peter|date=2009|title=Minimum volume ellipsoid|url=https://onlinelibrary.wiley.com/doi/10.1002/wics.19|journal=Wiley Interdisciplinary Reviews: Computational Statistics|language=en|volume=1|issue=1|pages=71–82|doi=10.1002/wics.19|s2cid=122106661 |issn=1939-5108|url-access=subscription}}</ref> Because each method defines the location and scatter of the data differently, each produces different Mahalanobis distances.
The Minimum Covariance Determinant and Minimum Volume Ellipsoid approaches are more robust when samples contain outliers, while the sample mean and covariance matrix tend to be more reliable with small and biased data sets.<ref>{{Cite journal|last=Etherington|first=Thomas R.|date=2021-05-11|title=Mahalanobis distances for ecological niche modelling and outlier detection: implications of sample size, error, and bias for selecting and parameterising a multivariate location and scatter method|journal=PeerJ|language=en|volume=9|pages=e11436|doi=10.7717/peerj.11436|issn=2167-8359|pmc=8121071|pmid=34026369 |doi-access=free }}</ref>
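
As an illustration of how the choice of location and scatter changes the resulting distance, the following is a minimal sketch rather than a reference implementation. It compares the sample mean and covariance with a brute-force Minimum Covariance Determinant estimate on a small two-dimensional data set. The data values, the choice <math>h = 6</math>, and all function names are hypothetical, and the sketch omits the consistency correction and reweighting steps that practical implementations usually apply. The exhaustive search over subsets is only feasible for very small samples.

```python
from itertools import combinations
from math import sqrt


def mean_and_cov(points):
    """Sample mean and 2x2 sample covariance (n - 1 denominator) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def det(cov):
    """Determinant of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy ** 2


def mahalanobis(point, location, cov):
    """Mahalanobis distance of a 2-D point for a given location and scatter."""
    sxx, sxy, syy = cov
    dx = point[0] - location[0]
    dy = point[1] - location[1]
    # Quadratic form with the closed-form inverse of the 2x2 scatter matrix.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det(cov)
    return sqrt(q)


def mcd_exhaustive(points, h):
    """Raw MCD estimate: mean and covariance of the h-point subset whose
    covariance matrix has the smallest determinant (brute-force search)."""
    best = min(combinations(points, h), key=lambda s: det(mean_and_cov(s)[1]))
    return mean_and_cov(best)


# Hypothetical data: six points close to a line plus one outlier.
outlier = (2.0, 9.0)
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9),
        (5.0, 5.1), (6.0, 5.8), outlier]

classical_location, classical_cov = mean_and_cov(data)
robust_location, robust_cov = mcd_exhaustive(data, h=6)

d_classical = mahalanobis(outlier, classical_location, classical_cov)
d_robust = mahalanobis(outlier, robust_location, robust_cov)

print("sample mean and covariance:", d_classical)
print("minimum covariance determinant:", d_robust)
```

In this sketch the outlier inflates the sample covariance matrix, which keeps its own distance small under the classical estimate. The subset-based estimate excludes it, so the same point receives a much larger distance.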