Kolmogorov–Smirnov test
==Two-sample Kolmogorov–Smirnov test==
[[File:KS2 Example.png|thumb|300px|Illustration of the two-sample Kolmogorov–Smirnov statistic. Red and blue lines each correspond to an empirical distribution function, and the black arrow is the two-sample KS statistic.]]
The Kolmogorov–Smirnov test may also be used to test whether two underlying one-dimensional probability distributions differ. In this case, the Kolmogorov–Smirnov statistic is
<math display="block">D_{n,m}=\sup_x |F_{1,n}(x)-F_{2,m}(x)|,</math>
where <math>F_{1,n}</math> and <math>F_{2,m}</math> are the [[empirical distribution function]]s of the first and the second sample respectively, and <math>\sup</math> is the [[Infimum and supremum|supremum function]].

For large samples, the null hypothesis is rejected at level <math>\alpha</math> if
<math display="block">D_{n,m}>c(\alpha)\sqrt{\frac{n + m}{n\cdot m}},</math>
where <math>n</math> and <math>m</math> are the sizes of the first and second sample respectively. The value of <math>c({\alpha})</math> is given in the table below for the most common levels of <math>\alpha</math>
{| class="wikitable"
|-
! <math>\alpha</math>
| 0.20 || 0.15 || 0.10 || 0.05 || 0.025 || 0.01 || 0.005 || 0.001
|-
! <math>c({\alpha})</math>
| 1.073 || 1.138 || 1.224 || 1.358 || 1.480 || 1.628 || 1.731 || 1.949
|}
and in general<ref>Eq. (15) in Section 3.3.1 of Knuth, D.E., The Art of Computer Programming, Volume 2 (Seminumerical Algorithms), 3rd Edition, Addison Wesley, Reading Mass, 1998.</ref> by
<math display="block">c\left(\alpha\right)=\sqrt{-\ln\left(\tfrac{\alpha}{2}\right)\cdot \tfrac{1}{2}},</math>
so that the condition reads
<math display="block">D_{n,m}>\sqrt{-\ln\left(\tfrac{\alpha}{2}\right)\cdot \tfrac{1 + \tfrac{m}{n}}{2m}}.</math>
Here, again, larger samples make the test more sensitive: for a given ratio of sample sizes (e.g. <math>m=n</math>), the rejection threshold shrinks in proportion to the inverse square root of the size of either sample.
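The large-sample rejection rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (<code>ks_two_sample_statistic</code>, <code>c_alpha</code>, <code>ks_critical_value</code>, <code>reject_null</code>) are hypothetical, and the threshold is the asymptotic approximation given above, so it should not be relied on for small samples.

```python
import math
from bisect import bisect_right


def ks_two_sample_statistic(xs, ys):
    """Return D_{n,m} = sup_x |F_{1,n}(x) - F_{2,m}(x)| for two samples."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both empirical distribution functions are right-continuous step
    # functions that only jump at observed values, so the supremum is
    # attained at one of the data points.
    for x in xs + ys:
        f1 = bisect_right(xs, x) / n  # F_{1,n}(x): share of xs that are <= x
        f2 = bisect_right(ys, x) / m  # F_{2,m}(x): share of ys that are <= x
        d = max(d, abs(f1 - f2))
    return d


def c_alpha(alpha):
    """c(alpha) = sqrt(-ln(alpha / 2) / 2), as in the general formula above."""
    return math.sqrt(-math.log(alpha / 2) / 2)


def ks_critical_value(alpha, n, m):
    """Large-sample rejection threshold c(alpha) * sqrt((n + m) / (n * m))."""
    return c_alpha(alpha) * math.sqrt((n + m) / (n * m))


def reject_null(xs, ys, alpha=0.05):
    """Assumed decision rule: reject when D_{n,m} exceeds the threshold."""
    d = ks_two_sample_statistic(xs, ys)
    return d > ks_critical_value(alpha, len(xs), len(ys))
```

For example, <code>round(c_alpha(0.05), 3)</code> reproduces the tabulated value 1.358, and two samples with no overlap, such as <code>[1, 2, 3]</code> and <code>[4, 5, 6]</code>, give a statistic of 1.0.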

Note that the two-sample test checks whether the two data samples come from the same distribution. This does not specify what that common distribution is (e.g. whether it is normal or not). Again, tables of critical values have been published.

A shortcoming of the univariate Kolmogorov–Smirnov test is that it is not very powerful, because it is designed to be sensitive to all possible types of differences between two distribution functions. Some argue<ref>{{cite journal |last1=Marozzi |first1=Marco |title=Some Notes on the Location-Scale Cucconi Test |journal=Journal of Nonparametric Statistics |date=2009 |volume=21 |issue=5 |pages=629–647 |doi=10.1080/10485250902952435 |s2cid=120038970 }}</ref><ref>{{cite journal |last1=Marozzi |first1=Marco |title=Nonparametric Simultaneous Tests for Location and Scale Testing: a Comparison of Several Methods |journal=Communications in Statistics – Simulation and Computation |date=2013 |volume=42 |issue=6 |pages=1298–1317 |doi=10.1080/03610918.2012.665546 |s2cid=28146102 }}</ref> that the [[Cucconi test]], originally proposed for simultaneously comparing location and scale, can be much more powerful than the Kolmogorov–Smirnov test when comparing two distribution functions.

Two-sample KS tests have been applied in economics to detect asymmetric effects and to study natural experiments.<ref>{{cite journal |last1=Monge |first1=Marco |title=Two-Sample Kolmogorov-Smirnov Tests as Causality Tests. A narrative of Latin American inflation from 2020 to 2022. |date=2023 |volume=17 |issue=1 |pages=68–78 |url=https://rches.utem.cl/articulos/two-sample-kolmogorov-smirnov-tests-as-causality-tests-a-narrative-of-latin-american-inflation-from-2020-to-2022/|journal=Revista Chilena de Economía y Sociedad }}</ref>