
==Estimating quantiles from a sample==
One problem which frequently arises is estimating a quantile of a (very large or infinite) population based on a finite sample of size {{mvar|N}}.

Modern statistical packages rely on a number of techniques to [[Estimation theory|estimate]] the quantiles.

[[Rob J. Hyndman|Hyndman]] and Fan compiled a [[Taxonomy (general)|taxonomy]] of nine algorithms<ref>{{cite journal |last1=Hyndman |first1=Rob J. |author-link1=Rob J. Hyndman |last2=Fan |first2=Yanan |title=Sample Quantiles in Statistical Packages |journal=American Statistician |date=November 1996 |volume=50 |issue=4 |pages=361–365 |doi=10.2307/2684934 |jstor=2684934 |publisher=American Statistical Association |url=https://www.researchgate.net/publication/222105754}}</ref> used by various software packages.  
All methods compute {{mvar|Q<sub>p</sub>}}, the estimate for the {{mvar|p}}-quantile (the {{mvar|k}}-th {{mvar|q}}-quantile, where {{math|''p'' {{=}} ''k''/''q''}}) from a sample of size {{mvar|N}} by computing a real-valued index {{mvar|h}}.  When {{mvar|h}} is an integer, the {{mvar|h}}-th smallest of the {{mvar|N}} values, {{mvar|x<sub>h</sub>}}, is the quantile estimate.  Otherwise, a rounding or interpolation scheme is used to compute the quantile estimate from {{mvar|h}}, {{math|''x''<sub>⌊''h''⌋</sub>}}, and {{math|''x''<sub>⌈''h''⌉</sub>}}.  (For notation, see [[floor and ceiling functions]].)

The first three are piecewise constant, changing abruptly at each data point, while the last six use linear interpolation between data points and differ only in how they choose the index {{mvar|h}}, which selects the point along the piecewise linear interpolation curve.

[[Mathematica]],<ref>[http://reference.wolfram.com/language/ref/Quantile.html#DetailsAndOptions Mathematica Documentation] See 'Details' section</ref> [[Matlab]],<ref>{{Cite web|url=https://uk.mathworks.com/matlabcentral/fileexchange/46555-quantile-calculation|title=Quantile calculation|website=uk.mathworks.com}}</ref> [[R (programming language)|R]]<ref>{{cite book |last1=Frohne |first1=Ivan |last2=Hyndman |first2=Rob J. |author-link2=Rob J. Hyndman |title=Sample Quantiles |publisher=R Project |url=http://stat.ethz.ch/R-manual/R-devel/library/stats/html/quantile.html |isbn=978-3-900051-07-5 |year=2009}}</ref> and [[GNU Octave]]<ref name="Function Reference: quantile – Octave-Forge – SourceForge">{{cite web |title=Function Reference: quantile – Octave-Forge – SourceForge |url=http://octave.sourceforge.net/octave/function/quantile.html|access-date=6 September 2013}}</ref> programming languages support all nine sample quantile methods. [[SAS (software)|SAS]] includes five sample quantile methods, [[SciPy]]<ref>{{Cite web|url=https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.mquantiles.html|title=scipy.stats.mstats.mquantiles — SciPy v1.4.1 Reference Guide|website=docs.scipy.org}}</ref> and [[Maple (software)|Maple]]<ref>{{Cite web|url=https://www.maplesoft.com/support/help/maple/view.aspx?path=Statistics/Quantile|title=Statistics – Maple Programming Help|website=www.maplesoft.com}}</ref> both include eight, [[EViews]]<ref>{{cite web|url=http://www.eviews.com/help/EViews%209%20Help/graphs.020.09.html#ww140852 |title=EViews 9 Help |access-date=April 4, 2016 |url-status=dead |archive-url=https://web.archive.org/web/20160416123322/http://www.eviews.com/help/EViews%209%20Help/graphs.020.09.html |archive-date=April 16, 2016 }}</ref> and [[Julia (programming language)|Julia]]<ref>{{cite web|url=https://docs.julialang.org/en/v1/stdlib/Statistics/#Statistics.quantile |title=Statistics – Julia Documentation |access-date=June 17, 2023}}</ref> include the 
six piecewise linear functions, and [[Stata]],<ref>[https://www.stata.com/manuals/dpctile.pdf Stata documentation for the pctile and xtile commands] See 'Methods and formulas' section.</ref> [[Python (programming language)|Python]]<ref>{{Cite web|url=https://docs.python.org/3/library/statistics.html#statistics.quantiles|title=statistics — Mathematical statistics functions — Python 3.8.3rc1 documentation|website=docs.python.org}}</ref> and [[Microsoft Excel]] each include two. Mathematica, SciPy and Julia also accept arbitrary parameters, which allows other, non-standard methods to be specified.
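The parameter pairs that appear in the table below (for example SciPy‑(1/2,1/2)) can be read as two numbers that shift the index {{mvar|h}}. The following is a minimal sketch in pure Python, assuming the parameterization {{math|''h'' {{=}} ''Np'' + ''a'' + ''p''(1 − ''a'' − ''b'')}}, which is how the SciPy documentation for <code>mquantiles</code> describes its <code>alphap</code> and <code>betap</code> arguments. The function names <code>plotting_position_index</code> and <code>interpolated_quantile</code> are hypothetical and are not part of any library; clamping {{mvar|h}} to the range 1 to {{mvar|N}} is one possible choice for the tails.

```python
def plotting_position_index(n, p, a, b):
    """Real-valued index h for sample size n and probability p.

    Assumed parameterization: h = n*p + a + p*(1 - a - b).
    """
    return n * p + a + p * (1 - a - b)


def interpolated_quantile(data, p, a, b):
    """Linear interpolation between order statistics at index h."""
    xs = sorted(data)
    n = len(xs)
    h = plotting_position_index(n, p, a, b)
    h = min(max(h, 1), n)        # assumption: hold the estimate constant in the tails
    lo = int(h)                  # floor of h, valid because h >= 1
    hi = min(lo + 1, n)
    return xs[lo - 1] + (h - lo) * (xs[hi - 1] - xs[lo - 1])


sample = [10, 20, 30, 40, 50]
print(interpolated_quantile(sample, 0.25, 0, 0))  # (a, b) = (0, 0): h = (N + 1)p
print(interpolated_quantile(sample, 0.25, 1, 1))  # (a, b) = (1, 1): h = (N - 1)p + 1
```

With {{math|(''a'', ''b'')}} set to (0,1), (1/2,1/2), (0,0), (1,1), (1/3,1/3) and (3/8,3/8), this index reduces to the six piecewise linear formulas for {{mvar|h}} listed in the table.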

The estimate types and interpolation schemes used include:

{| class="wikitable"
|-
! Type
! {{mvar|h}}
! {{mvar|Q<sub>p</sub>}}
! Notes
|-
| R‑1, SAS‑3, Maple‑1
| {{mvar|Np}}
| {{math|''x''<sub>⌈''h''⌉</sub>}}
| Inverse of [[empirical distribution function]].
|-
| R‑2, SAS‑5, Maple‑2, Stata
| {{math|''Np'' + 1/2}}
| {{math|(''x''<sub>⌈''h'' − 1/2⌉</sub> + ''x''<sub>⌊''h'' + 1/2⌋</sub>) / 2}}
| The same as R-1, but with averaging at discontinuities. 
|-
| R‑3, SAS‑2
| {{mvar|Np}}
| {{math|''x''<sub>⌊''h''⌉</sub>}}
| The observation numbered closest to {{mvar|Np}}.  Here, {{math|⌊''h''⌉}} indicates rounding to the nearest integer, [[Rounding#Round half to even|choosing the even integer in the case of a tie]].
|-
| R‑4, SAS‑1, SciPy‑(0,1), Julia‑(0,1), Maple‑3
| {{mvar|Np}}
|rowspan=6| {{math|''x''<sub>⌊''h''⌋</sub> + (''h'' − ⌊''h''⌋) (''x''<sub>⌈''h''⌉</sub> − ''x''<sub>⌊''h''⌋</sub>)}}
| Linear interpolation of the inverse of the empirical distribution function.
|-
| R‑5, SciPy‑(1/2,1/2), Julia‑(1/2,1/2), Maple‑4
| {{math|''Np'' + 1/2}}
| Piecewise linear function where the knots are the values midway through the steps of the empirical distribution function.
|-
| R‑6, Excel, Python, SAS‑4, SciPy‑(0,0), Julia‑(0,0), Maple‑5, Stata‑altdef
| {{math|(''N'' + 1)''p''}}
| Linear interpolation of the expectations for the order statistics for the uniform distribution on [0,1].  That is, it is the linear interpolation between points {{math|(''p''<sub>''h''</sub>, ''x''<sub>''h''</sub>)}}, where {{math|1=''p''<sub>''h''</sub> = ''h''/(''N''+1)}} is the probability that the last of ({{math|''N''+1}}) randomly drawn values will not exceed the {{mvar|h}}-th smallest of the first {{mvar|N}} randomly drawn values.
|-
| R‑7, Excel, Python, SciPy‑(1,1), Julia‑(1,1), Maple‑6, NumPy
| {{math|(''N'' − 1)''p'' + 1}}
| Linear interpolation of the modes for the order statistics for the uniform distribution on [0,1].
|-
| R‑8, SciPy‑(1/3,1/3), Julia‑(1/3,1/3), Maple‑7
| {{math|(''N'' + 1/3)''p'' + 1/3}}
| Linear interpolation of the approximate medians for order statistics.
|-
| R‑9, SciPy‑(3/8,3/8), Julia‑(3/8,3/8), Maple‑8
| {{math|(''N'' + 1/4)''p'' + 3/8}}
| The resulting quantile estimates are approximately unbiased for the expected order statistics if {{mvar|x}} is normally distributed.
|}
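The nine estimate types can be sketched in a single function. This is an illustrative pure-Python reading of the table, not the code of any of the packages listed; the name <code>sample_quantile</code> is hypothetical. Two assumptions are made: indices outside the range 1 to {{mvar|N}} are clamped to the nearest valid observation, and for R‑3 the value {{mvar|Np}} itself is rounded to the nearest integer (ties to even), which matches the note "the observation numbered closest to {{mvar|Np}}".

```python
import math


def sample_quantile(data, p, method=7):
    """Estimate the p-quantile with Hyndman-Fan type `method` (1 to 9)."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")

    def x(i):
        # 1-based order statistic; out-of-range indices are clamped (assumption).
        return xs[min(max(int(i), 1), n) - 1]

    if method == 1:                      # inverse of the empirical distribution function
        return x(math.ceil(n * p))
    if method == 2:                      # as R-1, averaging at discontinuities
        h = n * p + 0.5
        return (x(math.ceil(h - 0.5)) + x(math.floor(h + 0.5))) / 2
    if method == 3:                      # observation numbered closest to Np
        return x(round(n * p))           # round() breaks ties towards the even integer

    # R-4 to R-9 share one interpolation formula and differ only in h.
    index_formulas = {
        4: n * p,
        5: n * p + 0.5,
        6: (n + 1) * p,
        7: (n - 1) * p + 1,
        8: (n + 1 / 3) * p + 1 / 3,
        9: (n + 1 / 4) * p + 3 / 8,
    }
    h = min(max(index_formulas[method], 1), n)   # clamp to [1, N] (assumption)
    lo, hi = math.floor(h), math.ceil(h)
    return x(lo) + (h - lo) * (x(hi) - x(lo))


sample = [10, 20, 30, 40, 50]
for method in range(1, 10):
    print(method, sample_quantile(sample, 0.25, method))
```

For this five-point sample and {{math|''p'' {{=}} 1/4}}, the types R‑4, R‑5, R‑6 and R‑7 give 12.5, 17.5, 15 and 20 respectively, which illustrates how much the choice of {{mvar|h}} matters for small samples.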

Notes:
*R‑1 through R‑3 are piecewise constant, with discontinuities.
*R‑4 and following are piecewise linear, without discontinuities, but differ in how {{mvar|h}} is computed.
*R‑3 and R‑4 are not symmetric in that they do not give {{math|''h'' {{=}} (''N'' + 1) / 2}} when {{math|''p'' {{=}} 1/2}}.
*Excel's PERCENTILE.EXC and Python's default "exclusive" method are equivalent to R‑6.
*Excel's PERCENTILE and PERCENTILE.INC and Python's optional "inclusive" method are equivalent to R‑7. R‑7 is also the default method in R and Julia.
*Packages differ in how they estimate quantiles beyond the lowest and highest values in the sample, i.e. {{math|''p'' &lt; 1/''N''}} and {{math|''p'' &gt; (''N'' − 1)/''N''}}.  Choices include returning an error value, computing linear extrapolation, or assuming a constant value.
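The difference between the two Python methods mentioned in the notes can be seen with the standard library's <code>statistics.quantiles</code> function (available since Python 3.8). The sample values below are made up for illustration.

```python
import statistics

sample = [10, 20, 30, 40, 50]

# Default "exclusive" method: quartile cut points using h = (N + 1)p, as in R-6.
exclusive = statistics.quantiles(sample, n=4, method="exclusive")

# Optional "inclusive" method: quartile cut points using h = (N - 1)p + 1, as in R-7.
inclusive = statistics.quantiles(sample, n=4, method="inclusive")

print(exclusive)  # [15.0, 30.0, 45.0]
print(inclusive)  # [20.0, 30.0, 40.0]
```

Both methods agree on the median of this sample but place the first and third quartiles differently, because the inclusive method treats the smallest and largest observations as the 0th and 100th percentiles.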

Of the techniques, Hyndman and Fan recommend R-8, but most statistical software packages have chosen R-6 or R-7 as the default.<ref>{{cite web
 |title=Sample quantiles 20 years later
 |first=Rob J. |last=Hyndman |author-link=Rob J. Hyndman
 |url=https://robjhyndman.com/hyndsight/sample-quantiles-20-years-later/
 |date=28 March 2016 
 |access-date=2020-11-30
 |website=Hyndsight blog
}}</ref>

The [[standard error (statistics)|standard error]] of a quantile estimate can in general be estimated via the [[bootstrap (statistics)|bootstrap]].  The Maritz–Jarrett method can also be used.<ref>{{cite book |first=Rand R. |last=Wilcox |title=Introduction to Robust Estimation and Hypothesis Testing |year=2010 |publisher=Academic Press |isbn=978-0-12-751542-7 }}</ref>
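A bootstrap estimate of the standard error can be sketched as follows: resample the data with replacement many times, recompute the quantile on each resample, and take the standard deviation of the resulting estimates. The function names <code>r7_quantile</code> and <code>bootstrap_quantile_se</code> are hypothetical, the quantile is computed with the R‑7 formula, and the number of resamples and the seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def r7_quantile(data, p):
    """Quantile estimate of type R-7: h = (N - 1)p + 1 with linear interpolation."""
    xs = sorted(data)
    h = (len(xs) - 1) * p + 1
    lo, hi = math.floor(h), math.ceil(h)
    return xs[lo - 1] + (h - lo) * (xs[hi - 1] - xs[lo - 1])


def bootstrap_quantile_se(data, p, n_boot=2000, seed=0):
    """Bootstrap standard error of the p-quantile estimate."""
    rng = random.Random(seed)            # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=n)    # draw N values with replacement
        estimates.append(r7_quantile(resample, p))
    return statistics.stdev(estimates)


sample = list(range(1, 101))
print(bootstrap_quantile_se(sample, 0.5))
```

The result varies with the seed and with the number of resamples, so it should be read as an approximation rather than an exact value.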