== Applications ==
=== Z-test===
{{main article|Z-test}}
The z-score is often used in the z-test in standardized testing, which is the analog of the [[Student's t-test]] for a population whose parameters are known rather than estimated. Because the parameters of an entire population are rarely known, the t-test is much more widely used.

=== Prediction intervals===
{{anchor|prediction intervals}}
The standard score can be used in the calculation of [[prediction interval]]s. A prediction interval [''L'',''U''], consisting of a lower endpoint designated ''L'' and an upper endpoint designated ''U'', is an interval such that a future observation ''X'' will lie in the interval with high probability <math>\gamma</math>, i.e.

:<math>P(L<X<U) =\gamma.</math>

For the standard score <math>Z = (X-\mu)/\sigma</math> of ''X'', where <math>\mu</math> and <math>\sigma</math> are the mean and standard deviation of ''X'', this gives:<ref>{{cite book |author=E. Kreyszig |author-link=Erwin Kreyszig |edition=Fourth |year=1979 |title=Advanced Engineering Mathematics |publisher=Wiley |isbn=0-471-02140-7 |page=880, eq. 6}}</ref>
:<math>P\left( \frac{L-\mu}{\sigma} < Z < \frac{U-\mu}{\sigma} \right) = \gamma.</math>
By determining the quantile ''z'' such that
:<math>P\left( -z < Z < z \right) = \gamma</math>
it follows that
:<math>L=\mu-z\sigma,\ U=\mu+z\sigma.</math>
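
The calculation above can be sketched in Python. This is a minimal illustration, not a reference implementation: it assumes that ''X'' is normally distributed (so that ''Z'' is standard normal), and the function name <code>prediction_interval</code> and the numerical values are hypothetical, chosen only for the example.

```python
from statistics import NormalDist


def prediction_interval(mu, sigma, gamma):
    """Return (L, U) with P(L < X < U) = gamma, assuming X ~ Normal(mu, sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    # Quantile z such that P(-z < Z < z) = gamma for a standard normal Z.
    z = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    return mu - z * sigma, mu + z * sigma


# Illustrative values only: mu = 100, sigma = 15, gamma = 0.95.
lower, upper = prediction_interval(100.0, 15.0, 0.95)
print(f"L = {lower:.2f}, U = {upper:.2f}")
```

For <math>\gamma = 0.95</math> the quantile is approximately 1.96, so the interval extends about 1.96 standard deviations on either side of the mean.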

=== Process control===
In process control applications, the Z value provides an assessment of the degree to which a process is operating off-target.

=== Comparison of scores measured on different scales: ACT and SAT ===
[[File:Z score for Students A.png|thumb|224x224px|The ''z'' score for Student A was 1, meaning Student A was 1 standard deviation above the mean. Thus, Student A scored at the 84.13th percentile on the SAT.]]
When scores are measured on different scales, they may be converted to z-scores to aid comparison. Diez et al.<ref name="Diez2012">{{Citation
|last1= Diez
|first1= David
|last2= Barr
|first2= Christopher
|last3= Çetinkaya-Rundel
|first3= Mine
|title= OpenIntro Statistics
|edition=Second
|year=2012
|publisher= openintro.org
|url=https://www.openintro.org/stat/textbook.php?stat_book=os
}}
</ref> give the following example, comparing student scores on the (old) [[SAT]] and [[ACT (test)|ACT]] high school tests. The table shows the mean and standard deviation for total scores on the SAT and ACT. Suppose that student A scored 1800 on the SAT, and student B scored 24 on the ACT. Which student performed better relative to other test-takers?

{| class="wikitable"
|-
!
! SAT
! ACT
|-
! Mean
| 1500
| 21
|-
! Standard deviation
| 300
| 5
|}
[[File:Z score for Student B.png|thumb|The ''z'' score for Student B was 0.6, meaning Student B was 0.6 standard deviations above the mean. Thus, Student B scored at the 72.57th percentile on the ACT.]]
The z-score for student A is <math>z = {x- \mu \over \sigma} = {1800- 1500 \over 300} = 1 </math>

The z-score for student B is <math>z = {x- \mu \over \sigma} = {24- 21 \over 5} = 0.6 </math>

Because student A has a higher z-score than student B, student A performed better compared to other test-takers than did student B.

=== Percentage of observations below a z-score ===
Continuing the example of ACT and SAT scores, if it can be further assumed that both ACT and SAT scores are [[Normal distribution|normally distributed]] (which is approximately correct), then the z-scores may be used to calculate the percentage of test-takers who received lower scores than students A and B.
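
As a rough sketch of that calculation, the following Python snippet converts each score to a z-score and evaluates the standard normal cumulative distribution function at it. The normality of both score distributions is the assumption stated above, and the helper name <code>percent_below</code> is hypothetical.

```python
from statistics import NormalDist


def percent_below(x, mean, sd):
    """Percentage of a Normal(mean, sd) population scoring below x."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (x - mean) / sd                 # standard score of x
    return 100.0 * NormalDist().cdf(z)  # area under the standard normal curve below z


pct_a = percent_below(1800, 1500, 300)  # student A, SAT (z = 1)
pct_b = percent_below(24, 21, 5)        # student B, ACT (z = 0.6)
print(f"Student A: {pct_a:.2f}%  Student B: {pct_b:.2f}%")
```

Under these assumptions, about 84.13% of SAT test-takers scored below student A and about 72.57% of ACT test-takers scored below student B.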

=== Cluster analysis and multidimensional scaling ===
"For some multivariate techniques such as multidimensional scaling and cluster analysis, the concept of distance between the units in the data is often of considerable interest and importance… When the variables in a multivariate data set are on different scales, it makes more sense to calculate the distances after some form of standardization."<ref name="EverittHothorn2011 ">{{Citation |last1= Everitt |first1= Brian |last2= Hothorn |first2= Torsten J |title= An Introduction to Applied Multivariate Analysis with R |year=2011|publisher= Springer
|isbn= 978-1441996497 }} </ref>
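
The effect of standardization on distances can be illustrated with a small Python sketch. The data are invented for the example (three units measured by height in centimetres and annual income in dollars), the helper name <code>standardize_columns</code> is hypothetical, and the sample standard deviation is used as one possible choice of scale.

```python
from math import dist
from statistics import mean, stdev


def standardize_columns(rows):
    """Convert each column of `rows` to z-scores using the sample standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, sds))
        for row in rows
    ]


# Hypothetical units: (height in cm, annual income in dollars).
units = [(150.0, 30000.0), (190.0, 30500.0), (152.0, 60000.0)]
scaled = standardize_columns(units)

# Euclidean distances from the first unit to the other two.
raw_01, raw_02 = dist(units[0], units[1]), dist(units[0], units[2])
std_01, std_02 = dist(scaled[0], scaled[1]), dist(scaled[0], scaled[2])

print(f"raw:          {raw_01:.1f} vs {raw_02:.1f}")
print(f"standardized: {std_01:.2f} vs {std_02:.2f}")
```

On the raw data, the income column dominates the distances because its numerical range is far larger, so the large height difference between the first two units barely registers. After standardization both variables contribute on a comparable scale.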

===Principal components analysis===
In principal components analysis, "Variables measured on different scales or on a common scale with widely differing ranges are often standardized."<ref name="JohnsonWichern2007">{{Citation |last1= Johnson |first1= Richard |last2= Wichern |first2= Dean |title= Applied Multivariate Statistical Analysis |year=2007|publisher= Pearson / Prentice Hall}}</ref>

=== Relative importance of variables in multiple regression: standardized regression coefficients ===
Standardization of variables prior to [[multiple regression analysis]] is sometimes used as an aid to interpretation. Afifi, May and Clark<ref name="AfifiMayClark2012">{{Citation |last1= Afifi |first1= Abdelmonem |last2= May |first2= Susanne K. |last3= Clark |first3= Virginia A. |title= Practical Multivariate Analysis
|edition= Fifth |year=2012 |publisher= Chapman & Hall/CRC |isbn= 978-1439816806}}</ref>
(page 95) state the following.

"The standardized regression slope is the slope in the regression equation if X and Y are standardized … Standardization of X and Y is done by subtracting the respective means from each set of observations and dividing by the respective standard deviations … In multiple regression, where several X variables are used, the standardized regression coefficients quantify the relative contribution of each X variable."

However, Kutner et al.<ref name="KutnerNachtsheim2004">{{Citation |last1= Kutner |first1= Michael |last2= Nachtsheim |first2= Christopher |last3= Neter |first3= John |title= Applied Linear Regression Models |edition= Fourth |year=2004 |publisher= McGraw Hill|isbn= 978-0073014661 }}</ref> (p. 278) give the following caveat: "… one must be cautious about interpreting any regression coefficients, whether standardized or not. The reason is that when the predictor variables are correlated among themselves, … the regression coefficients are affected by the other predictor variables in the model … The magnitudes of the standardized regression coefficients are affected not only by the presence of correlations among the predictor variables but also by the spacings of the observations on each of these variables. Sometimes these spacings may be quite arbitrary. Hence, it is ordinarily not wise to interpret the magnitudes of standardized regression coefficients as reflecting the comparative importance of the predictor variables."
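
The standardization described by Afifi et al. can be illustrated for the simplest case of a single predictor. The following Python sketch uses invented data and hypothetical helper names (<code>slope</code>, <code>standardize</code>). It shows that fitting the regression to standardized ''X'' and ''Y'' gives the same slope as rescaling the raw slope by the ratio of the standard deviations; it does not address the multiple-regression case or the caveat about correlated predictors.

```python
from statistics import mean, stdev


def slope(xs, ys):
    """Least-squares slope of ys on xs (simple linear regression)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

raw_slope = slope(xs, ys)                            # in units of Y per unit of X
std_slope = slope(standardize(xs), standardize(ys))  # unit-free
rescaled = raw_slope * stdev(xs) / stdev(ys)         # same quantity, from the raw fit

print(f"raw slope = {raw_slope:.3f}, standardized slope = {std_slope:.4f}")
```

With one predictor, the standardized slope equals the sample correlation between ''X'' and ''Y'', so it always lies between −1 and 1.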