
==Definitions==

The term "nonparametric statistics" has been defined imprecisely, in at least the following two ways:

The first meaning of ''nonparametric'' covers techniques that do not rely on the data belonging to any particular parametric family of probability distributions.
These include, among others:
* ''Distribution-free'' methods, which do not assume that the data are drawn from a given parametric family of [[probability distributions]].
* Statistics defined as a function of the sample alone, with no dependence on a [[parameter]].
Examples are [[order statistic]]s, which are based on the [[Ranking#Ordinal ranking ("1234" ranking)|ordinal ranking]] of observations.
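
As a minimal sketch of why rank-based statistics do not depend on a distributional parameter, the following Python example computes order statistics and ordinal ranks, then checks that the ranks are unchanged by a strictly increasing transformation of the data. The function names and sample values are illustrative assumptions and do not come from any cited source.

```python
import math


def order_statistics(sample):
    """Return the sample sorted ascending: x_(1) <= ... <= x_(n)."""
    return sorted(sample)


def ordinal_ranks(sample):
    """Return the 1-based ordinal rank of each observation.

    Ties are broken by position in the sample, because sorted() is stable.
    """
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0] * len(sample)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


# Hypothetical data, chosen only for illustration.
sample = [2.3, -0.7, 5.1, 0.4]
# exp() is strictly increasing, so it changes the values but not their order.
transformed = [math.exp(x) for x in sample]

print(order_statistics(sample))  # [-0.7, 0.4, 2.3, 5.1]
print(ordinal_ranks(sample))     # [3, 1, 4, 2]
print(ordinal_ranks(transformed) == ordinal_ranks(sample))  # True
```

Because the ranks depend only on the ordering of the observations, any statistic built from them alone behaves the same way for every continuous distribution the data might have come from.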

The following discussion is taken from ''Kendall's Advanced Theory of Statistics''.<ref>Stuart A., Ord J.K., Arnold S. (1999), ''Kendall's Advanced Theory of Statistics: Volume 2A—Classical Inference and the Linear Model'', sixth edition, §20.2–20.3 ([[Edward Arnold (publisher)|Arnold]]).</ref>

<blockquote>
Statistical hypotheses concern the behavior of observable random variables....  For example, the hypothesis (a) that a normal distribution has a specified mean and variance is statistical; so is the hypothesis (b) that it has a given mean but unspecified variance; so is the hypothesis (c) that a distribution is of normal form with both mean and variance unspecified; finally, so is the hypothesis (d) that two unspecified continuous distributions are identical.

It will have been noticed that in the examples (a) and (b) the distribution underlying the observations was taken to be of a certain form (the normal) and the hypothesis was concerned entirely with the value of one or both of its parameters.  Such a hypothesis, for obvious reasons, is called ''parametric''.

Hypothesis (c) was of a different nature, as no parameter values are specified in the statement of the hypothesis; we might reasonably call such a hypothesis ''non-parametric''.  Hypothesis (d) is also non-parametric but, in addition, it does not even specify the underlying form of the distribution and may now be reasonably termed ''distribution-free''.  Notwithstanding these distinctions, the statistical literature now commonly applies the label "non-parametric" to test procedures that we have just termed "distribution-free", thereby losing a useful classification.
</blockquote>
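
Hypothesis (d) in the passage above can be tested without specifying the form of either distribution. The sketch below uses an exact two-sample permutation test, which is one possible distribution-free procedure and is not prescribed by the quoted text. The function name, the choice of statistic (the sum of the first sample) and the data are assumptions made for illustration.

```python
from itertools import combinations


def exact_permutation_p_value(x, y):
    """One-sided exact p-value for H0: x and y share one distribution.

    Under H0 every split of the pooled data into groups of these sizes is
    equally likely, so the null distribution of the statistic is obtained
    by enumerating all splits. Large sums for the first group count as
    extreme. Enumeration is feasible only for small samples.
    """
    pooled = list(x) + list(y)
    observed = sum(x)
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(x)):
        total += 1
        if sum(pooled[i] for i in idx) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical measurements, chosen only for illustration.
x = [12, 14, 15]
y = [9, 10, 11]
print(exact_permutation_p_value(x, y))  # 0.05 (1 of the 20 possible splits)
```

The p-value is computed from the relabelings alone, so no mean, variance or distributional form has to be assumed.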

The second meaning of ''nonparametric'' covers techniques that do not assume that the ''structure'' of a model is fixed. Typically, the model grows in size to accommodate the complexity of the data. In these techniques, individual variables ''are'' typically assumed to belong to parametric distributions, and assumptions are also made about the types of association among variables. These techniques include, among others:
* ''[[nonparametric regression]]'', in which the structure of the relationship between variables is treated nonparametrically, although there may still be parametric assumptions about the distribution of the model residuals.
* ''nonparametric hierarchical Bayesian models'', such as models based on the [[Dirichlet process]], which allow the number of [[latent variables]] to grow as necessary to fit the data, although individual variables still follow parametric distributions, and even the process controlling the rate of growth of the latent variables follows a parametric distribution.
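
The idea of a model whose structure is not fixed in advance can be illustrated with a kernel smoother. The sketch below uses a Nadaraya-Watson estimator with a Gaussian kernel, chosen here as one common form of nonparametric regression. The function name, the bandwidth value and the data are assumptions made for illustration. No functional form such as a line or polynomial is imposed, and the fitted "model" is the training set itself, so it grows with the data.

```python
import math


def nadaraya_watson(x_train, y_train, x0, bandwidth=1.0):
    """Estimate E[y | x = x0] as a kernel-weighted average of y_train.

    Weights come from a Gaussian kernel centred at x0. If x0 lies very far
    from every training point, all weights may underflow to zero and the
    division below will fail; this sketch does not guard against that.
    """
    weights = [
        math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in x_train
    ]
    weighted_sum = sum(w * yi for w, yi in zip(weights, y_train))
    return weighted_sum / sum(weights)


# Hypothetical observations, chosen only for illustration.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [0.1, 0.9, 4.2, 8.8, 16.1]

estimate = nadaraya_watson(x_train, y_train, 2.0, bandwidth=0.5)
print(estimate)
```

The bandwidth is a tuning choice rather than a parameter of an assumed curve: smaller values follow the data more closely, and larger values give a smoother estimate.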