Mann–Whitney U test
==Normal approximation and tie correction==
For large samples, ''U'' is approximately [[normal distribution|normally distributed]]. In that case, the [[standard score|standardized value]]
:<math>z = \frac{ U - m_U }{ \sigma_U }, \, </math>
where ''m''<sub>''U''</sub> and ''σ''<sub>''U''</sub> are the mean and standard deviation of ''U'', is approximately a standard normal deviate whose significance can be checked in tables of the normal distribution. ''m''<sub>''U''</sub> and ''σ''<sub>''U''</sub> are given by
:<math>m_U = \frac{n_1 n_2}{2}, \, </math><ref name="auto">{{cite book |last1=Siegel |first1=Sidney |year=1956 |title=Nonparametric statistics for the behavioral sciences |publisher=McGraw-Hill |page=121}}</ref>
and
:<math>\sigma_U=\sqrt{n_1 n_2 (n_1 + n_2+1) \over 12}. \, </math><ref name="auto"/>

The formula for the standard deviation is more complicated in the presence of tied ranks. If there are ties in ranks, ''σ''<sub>''U''</sub> should be adjusted as follows:
:<math> \sigma_\text{ties}=\sqrt{ {n_1 n_2 (n_1 + n_2 +1) \over 12 } - { n_1 n_2 \sum_{k=1}^K (t_k^3 - t_k) \over 12 n(n-1) } },\, </math><ref>{{cite book |last1=Lehmann |first1=Erich |last2=D'Abrera |first2=Howard |year=1975 |title=Nonparametrics: Statistical Methods Based on Ranks |publisher=Holden-Day |page=20}}</ref>
where {{math|1=''n'' = ''n''<sub>1</sub> + ''n''<sub>2</sub>}}, the first term under the square root is the variance of ''U'' in the absence of ties, the second term is the adjustment for ties, ''t''<sub>''k''</sub> is the number of observations sharing the ''k''th tied rank, and ''K'' is the number of distinct ranks at which ties occur.

A more computationally efficient form, with {{math|1=''n''<sub>1</sub>''n''<sub>2</sub>/12}} factored out, is
:<math> \sigma_\text{ties}=\sqrt{ {n_1 n_2 \over 12 } \left( (n+1) - { \sum_{k=1}^K (t_k^3 - t_k) \over n(n-1)} \right)}.</math>

If the number of ties is small (and especially if there are no large tie bands), ties can be ignored when doing calculations by hand.
Statistical software packages apply the tie-corrected formula as a matter of routine.

Since {{math|1=''U''<sub>1</sub> + ''U''<sub>2</sub> = ''n''<sub>1</sub>''n''<sub>2</sub>}}, the mean {{math|1=''n''<sub>1</sub>''n''<sub>2</sub>/2}} used in the normal approximation is the mean of the two values of ''U''. The two values therefore lie at equal distances on either side of the mean, and the absolute value of the ''z''-statistic is the same whichever value of ''U'' is used.
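
The tie-corrected normal approximation can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation used by any particular statistical package: the function names <code>average_ranks</code> and <code>mann_whitney_z</code> are hypothetical, tied observations are assumed to receive the average of the ranks they span, and ''U''<sub>1</sub> is assumed to be computed from the rank sum of the first sample as ''R''<sub>1</sub> − ''n''<sub>1</sub>(''n''<sub>1</sub> + 1)/2. No continuity correction is applied.

```python
from collections import Counter
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_z(x, y):
    """Return (U1, U2, z1, z2) using the tie-corrected standard deviation.

    Assumption: U1 = R1 - n1 (n1 + 1) / 2, with R1 the rank sum of sample x.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)

    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1              # because U1 + U2 = n1 * n2
    m_u = n1 * n2 / 2              # mean of U

    # Sum of (t_k^3 - t_k) over groups of equal values; untied values add 0.
    tie_term = sum(t**3 - t for t in Counter(pooled).values())

    # Factored form of the tie-corrected standard deviation.
    sigma_ties = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))

    return u1, u2, (u1 - m_u) / sigma_ties, (u2 - m_u) / sigma_ties


if __name__ == "__main__":
    u1, u2, z1, z2 = mann_whitney_z([1, 2, 2, 3], [2, 4, 5])
    print(u1, u2, z1, z2)
```

Because ''U''<sub>1</sub> and ''U''<sub>2</sub> lie symmetrically about ''m''<sub>''U''</sub>, the two ''z'' values returned by this sketch differ only in sign.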