Mann–Whitney U test
==Calculations==
The test involves the calculation of a [[statistic]], usually called ''U'', whose distribution under the [[null hypothesis]] is known:
* For small samples, the distribution is tabulated.
* For sample sizes above ~20, the approximation using the [[normal distribution]] is fairly good.

Alternatively, the null distribution can be approximated using [[permutation test]]s and Monte Carlo simulations.

Some books tabulate statistics equivalent to ''U'', such as the sum of ranks in one of the samples, rather than ''U'' itself.

The Mann–Whitney ''U'' test is included in most [[List of statistical packages|statistical packages]].

It is also easily calculated by hand, especially for small samples. There are two ways of doing this.

'''Method one:'''

For comparing two small sets of observations, a direct method is quick, and gives insight into the meaning of the ''U'' statistic, which corresponds to the number of wins out of all pairwise contests (see the tortoise and hare example under Examples below). For each observation in one set, count the number of observations in the other set that it beats (an observation wins a contest when it is the larger of the two values). Count 0.5 for any ties. The sum of wins and ties is ''U'' (i.e., <math>U_1</math>) for the first set. ''U'' for the other set is the converse (i.e., <math>U_2</math>).

'''Method two:'''

For larger samples:
# Assign numeric ranks to all the observations (pooling the observations from both groups into one set), beginning with 1 for the smallest value. Where there are groups of tied values, assign a rank equal to the midpoint of the unadjusted rankings (e.g., the ranks of {{math|(3, 5, 5, 5, 5, 8)}} are {{math|(1, 3.5, 3.5, 3.5, 3.5, 6)}}, where the unadjusted ranks would be {{math|(1, 2, 3, 4, 5, 6)}}).
# Now, add up the ranks for the observations which came from sample 1. The sum of ranks in sample 2 is now determined, since the sum of all the ranks equals {{math|''N''(''N'' + 1)/2}} where ''N'' is the total number of observations.
# ''U'' is then given by:<ref>{{cite book|last=Zar|first=Jerrold H.|title=Biostatistical Analysis|year=1998|publisher=Prentice Hall International, INC.|location=New Jersey|isbn=978-0-13-082390-8|page=147}}</ref>
:::<math>U_1=R_1 - {n_1(n_1+1) \over 2} \,\!</math>
::where ''n''<sub>1</sub> is the sample size for sample 1, and ''R''<sub>1</sub> is the sum of the ranks in sample 1.
::Note that it does not matter which of the two samples is considered sample 1. An equally valid formula for ''U'' is
:::<math>U_2= R_2 - {n_2(n_2+1) \over 2} \,\!</math>
::The smaller value of ''U''<sub>1</sub> and ''U''<sub>2</sub> is the one used when consulting significance tables. The sum of the two values is given by
:::<math>U_1 + U_2 = R_1 - {n_1(n_1+1) \over 2} + R_2 - {n_2(n_2+1) \over 2}. \,\!</math>
::Knowing that {{math|1=''R''<sub>1</sub> + ''R''<sub>2</sub> = ''N''(''N'' + 1)/2}} and {{math|1=''N'' = ''n''<sub>1</sub> + ''n''<sub>2</sub>}}, and doing some [[algebra]], we find that the sum is
:::{{math|1=''U''<sub>1</sub> + ''U''<sub>2</sub> = ''n''<sub>1</sub>''n''<sub>2</sub>}}.
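
The two hand-calculation methods can be sketched in a few lines of Python. This is an illustrative sketch only, not taken from any statistical package: the function names (<code>u_by_pairwise_wins</code>, <code>midranks</code>, <code>u_by_rank_sums</code>) and the sample values are hypothetical, chosen so that one tie occurs between the samples. It computes ''U''<sub>1</sub> and ''U''<sub>2</sub> both ways and checks that they agree and that their sum equals ''n''<sub>1</sub>''n''<sub>2</sub>.

```python
from itertools import product


def u_by_pairwise_wins(sample_a, sample_b):
    """Method one: U for sample_a, counting 1 per win and 0.5 per tie."""
    u = 0.0
    for x, y in product(sample_a, sample_b):
        if x > y:
            u += 1.0
        elif x == y:
            u += 0.5
    return u


def midranks(values):
    """Ranks starting at 1; tied values get the midpoint of their unadjusted ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        midpoint = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = midpoint
        start = end + 1
    return ranks


def u_by_rank_sums(sample_1, sample_2):
    """Method two: U_i = R_i - n_i (n_i + 1) / 2, using ranks of the pooled data."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = midranks(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    return r1 - n1 * (n1 + 1) / 2, r2 - n2 * (n2 + 1) / 2


# Hypothetical data, for illustration only.
sample_1 = [3, 5, 8]
sample_2 = [1, 5, 6, 9]

u1, u2 = u_by_rank_sums(sample_1, sample_2)
print(u1, u2)                                       # 5.5 6.5
print(u_by_pairwise_wins(sample_1, sample_2) == u1)  # True
print(u1 + u2 == len(sample_1) * len(sample_2))      # True
```

The pairwise count needs {{math|''n''<sub>1</sub>''n''<sub>2</sub>}} comparisons, while the rank-sum route is dominated by one sort of the pooled data, which is one reason the second method is preferred for larger samples.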