== U statistic ==
Let <math>X_1,\ldots, X_{n_1}</math> be group 1, an [[Independent and identically distributed random variables|i.i.d. sample]] from <math>X</math>, and <math>Y_1,\ldots, Y_{n_2}</math> be group 2, an i.i.d. sample from <math>Y</math>, and let both samples be independent of each other. The corresponding ''Mann–Whitney [[U statistic]]'' is defined as the smaller of:

:<math>U_1 = n_1 n_2 + \tfrac{n_1(n_1 + 1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \tfrac{n_2(n_2 + 1)}{2} - R_2</math>

where <math>R_1</math> and <math>R_2</math> are the sums of the ranks in groups 1 and 2, respectively, after ranking all observations from both groups together such that the smallest value obtains rank 1 and the largest obtains rank <math>n_1+n_2</math>.<ref>[https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_nonparametric/bs704_nonparametric4.html Boston University (SPH), 2017]</ref>
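
As an illustration, the definition above can be sketched in Python. The function names <code>midranks</code> and <code>mann_whitney_u</code> are hypothetical, and the handling of ties by average ranks is an assumption that this section does not itself state.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U1, U2) using the sign convention of this section."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))  # rank both groups together
    r1 = sum(ranks[:n1])                 # rank sum of group 1
    r2 = sum(ranks[n1:])                 # rank sum of group 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


# Made-up illustrative data, not taken from any reference.
x = [1.1, 2.3, 3.8]
y = [2.9, 4.4, 5.0, 6.2]
u1, u2 = mann_whitney_u(x, y)
print(u1, u2, min(u1, u2))  # the reported statistic is the smaller value
```

In this sketch the two values always sum to <math>n_1 n_2</math>, so either one determines the other.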

=== Area-under-curve (AUC) statistic for ROC curves ===
The ''U'' statistic is related to the '''area under the [[receiver operating characteristic]] curve'''  ([[Receiver operating characteristic#Area under the curve|AUC]]):<ref>{{cite journal | vauthors=((Mason, S. J.)), ((Graham, N. E.)) | journal=Quarterly Journal of the Royal Meteorological Society | title=Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation | volume=128 | issue=584 | pages=2145–2166 | date= 2002 | issn=1477-870X | doi=10.1256/003590002320603584}}</ref>
:<math>\mathrm{AUC}_1 = {U_1 \over n_1n_2}</math>

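This is the same definition as the [[common language effect size]], i.e. the probability that a classifier will rank a randomly chosen instance from one group higher than a randomly chosen instance from the other group.<ref name="fawcett">Fawcett, Tom (2006); ''[https://www.math.ucdavis.edu/~saito/data/roc/fawcett-roc.pdf An introduction to ROC analysis]'', Pattern Recognition Letters, 27, 861–874.</ref> Which group plays which role depends on the sign convention: with <math>U_1</math> as defined above, <math>U_1</math> counts the pairs in which the group 1 observation has the lower rank, whereas the alternative convention <math>U_1 = R_1 - \tfrac{n_1(n_1+1)}{2}</math> counts the pairs in which it has the higher rank.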
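
The probabilistic reading can be checked numerically. The following Python sketch, with hypothetical function names and made-up data, computes <math>\mathrm{AUC}_1</math> from the rank sum using the formula for <math>U_1</math> given in this section, and compares it with a direct count over all pairs. It assumes that tied values receive average ranks and count as half a pair.

```python
def rank_sum(group, pooled):
    """Sum of the ranks of `group` within `pooled`; ties get average ranks."""
    total = 0.0
    for v in group:
        less = sum(1 for w in pooled if w < v)
        equal = sum(1 for w in pooled if w == v)  # includes v itself
        total += less + (equal + 1) / 2
    return total


def auc_from_ranks(x, y):
    """AUC_1 = U_1 / (n1 * n2), with U_1 as defined in this section."""
    n1, n2 = len(x), len(y)
    r1 = rank_sum(x, list(x) + list(y))
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    return u1 / (n1 * n2)


def auc_from_pairs(x, y):
    """Fraction of (x_i, y_j) pairs with x_i < y_j; ties count one half."""
    total = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                total += 1.0
            elif xi == yj:
                total += 0.5
    return total / (len(x) * len(y))


# Made-up illustrative data.
x = [1.1, 2.3, 3.8]
y = [2.9, 4.4, 5.0, 6.2]
print(auc_from_ranks(x, y), auc_from_pairs(x, y))
```

Both functions return the same value, which shows that under this section's formula the ratio is the proportion of pairs in which the group 1 observation is the smaller one.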

Because of its probabilistic form, the ''U'' statistic can be generalized to a measure of a classifier's separation power for more than two classes:<ref>{{cite journal |last1=Hand |first1=David&nbsp;J. |last2=Till |first2=Robert&nbsp;J. |year=2001 |title=A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems |journal=Machine Learning |volume=45 |pages=171–186 |doi=10.1023/A:1010920819831 |doi-access=free |number=2}}</ref>
:<math>M = {1 \over c(c-1)} \sum_{k \neq \ell} \mathrm{AUC}_{k,\ell}</math>
where ''c'' is the number of classes, and the ''R''<sub>''k'',''ℓ''</sub> term of AUC<sub>''k'',''ℓ''</sub> considers only the ranking of the items belonging to classes ''k'' and ''ℓ'' (i.e., items belonging to all other classes are ignored) according to the classifier's estimates of the probability of those items belonging to class ''k''. AUC<sub>''k'',''k''</sub> will always be zero but, unlike in the two-class case, generally {{math|1=AUC<sub>''k'',''ℓ''</sub> ≠ AUC<sub>''ℓ'',''k''</sub>}}. The ''M'' measure therefore sums over all (''k'',''ℓ'') pairs, in effect using the average of AUC<sub>''k'',''ℓ''</sub> and AUC<sub>''ℓ'',''k''</sub>.
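
A minimal Python sketch of the ''M'' measure follows. The function names and the data are hypothetical. Each AUC<sub>''k'',''ℓ''</sub> is computed here by direct pair counting, as the fraction of (class ''k'', class ''ℓ'') item pairs in which the class ''k'' item receives the higher estimated probability of belonging to class ''k'', with ties counting one half. This orientation follows Hand and Till's formulation and is an assumption relative to the sign convention used for <math>U_1</math> earlier in this section.

```python
def pairwise_auc(scores_k, scores_l):
    """Fraction of pairs in which the class-k score is higher; ties count 1/2."""
    total = 0.0
    for a in scores_k:
        for b in scores_l:
            if a > b:
                total += 1.0
            elif a == b:
                total += 0.5
    return total / (len(scores_k) * len(scores_l))


def multiclass_m(labels, probs):
    """Average AUC over all ordered pairs of distinct classes.

    labels[i] is the true class index of item i, and probs[i][k] is the
    classifier's estimated probability that item i belongs to class k.
    """
    classes = sorted(set(labels))
    c = len(classes)
    total = 0.0
    for k in classes:
        for l in classes:
            if k == l:
                continue
            # Only items of classes k and l are used, scored by P(class k).
            s_k = [p[k] for p, lab in zip(probs, labels) if lab == k]
            s_l = [p[k] for p, lab in zip(probs, labels) if lab == l]
            total += pairwise_auc(s_k, s_l)
    return total / (c * (c - 1))


# Made-up illustrative data: six items, three classes.
labels = [0, 0, 1, 1, 2, 2]
probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.6, 0.3, 0.1],  # a class 1 item scored most strongly as class 0
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]
print(multiclass_m(labels, probs))
```

In this example the pair of classes 0 and 1 gives different values depending on which class supplies the probability estimates, which illustrates why both orderings enter the sum.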