
==Occurrence and applications==

===Occurrence of events===
The exponential distribution occurs naturally when describing the lengths of the inter-arrival times in a homogeneous [[Poisson process]].

The exponential distribution may be viewed as a continuous counterpart of the [[geometric distribution]], which describes the number of [[Bernoulli trial]]s necessary for a ''discrete'' process to change state. In contrast, the exponential distribution describes the time for a continuous process to change state.
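
The correspondence can be illustrated with a minimal Python sketch. It assumes an illustrative rate of 0.5 per time step (a value chosen here, not taken from the text) and shows that rounding an exponential waiting time up to the next whole step gives a geometric number of steps with success probability ''p'' = 1 − e<sup>−''λ''</sup>.

```python
import math

def exponential_survival(rate, t):
    """P(T > t) for an exponential waiting time T with the given rate."""
    return math.exp(-rate * t)

def geometric_survival(p, k):
    """P(N > k): the first success needs more than k Bernoulli trials."""
    return (1.0 - p) ** k

rate = 0.5                      # illustrative rate per time step (assumption)
p = 1.0 - math.exp(-rate)       # chance the state changes within one step

for k in range(6):
    # ceil(T) > k exactly when T > k, so both survival functions agree
    print(k, exponential_survival(rate, k), geometric_survival(p, k))
```

The two printed columns agree up to floating-point rounding, which is the sense in which the geometric distribution is the discrete counterpart.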

In real-world scenarios, the assumption of a constant rate (or probability per unit time) is rarely satisfied. For example, the rate of incoming phone calls differs according to the time of day. But if we focus on a time interval during which the rate is roughly constant, such as from 2 to 4 p.m. during work days, the exponential distribution can be used as a good approximate model for the time until the next phone call arrives. Similar caveats apply to the following examples, which yield approximately exponentially distributed variables:
* The time until a radioactive [[particle decay]]s, or the time between clicks of a [[Geiger counter]]
* The time between receiving one telephone call and the next
* The time until default (on payment to company debt holders) in reduced-form credit risk modeling
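
As a rough sketch of the phone-call example, the following Python snippet assumes a hypothetical constant rate of 12 calls per hour during the chosen interval (the figure is made up for illustration). It compares the exact probability that the next call arrives within five minutes with a simulated estimate.

```python
import math
import random

def prob_call_within(rate_per_hour, minutes):
    """P(next call arrives within `minutes`), assuming a constant rate."""
    return 1.0 - math.exp(-rate_per_hour * minutes / 60.0)

rate_per_hour = 12.0   # hypothetical: 12 calls per hour between 2 and 4 p.m.
exact = prob_call_within(rate_per_hour, 5.0)

# Monte Carlo check: draw waiting times (in hours) and count those <= 5 min
rng = random.Random(0)
draws = [rng.expovariate(rate_per_hour) for _ in range(100_000)]
empirical = sum(w <= 5.0 / 60.0 for w in draws) / len(draws)

print(f"exact     P(wait <= 5 min) = {exact:.4f}")
print(f"simulated P(wait <= 5 min) = {empirical:.4f}")
```

With these assumed numbers the exact value is 1 − e<sup>−1</sup> ≈ 0.632; the simulated estimate should be close to it but will vary with the random seed.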

Exponential variables can also be used to model situations where certain events occur with a constant probability per unit length, such as the distance between [[mutation]]s on a [[DNA]] strand, or between [[roadkill]]s on a given road.

In [[queuing theory]], the service times of agents in a system (e.g. how long it takes a bank teller to serve a customer) are often modeled as exponentially distributed variables. (The arrival of customers, for instance, is also modeled by the [[Poisson distribution]] if the arrivals are independent and identically distributed.) The length of a process that can be thought of as a sequence of several independent tasks follows the [[Erlang distribution]] (which is the distribution of the sum of several independent exponentially distributed variables with a common rate).

[[Reliability theory]] and [[reliability engineering]] also make extensive use of the exponential distribution. Because of the memoryless property of this distribution, it is well-suited to model the constant [[hazard rate]] portion of the [[bathtub curve]] used in reliability theory. It is also convenient because the [[failure rate]]s of independent components in series simply add in a reliability model. The exponential distribution is, however, not appropriate to model the overall lifetime of organisms or technical devices, because the "failure rates" here are not constant: more failures occur for very young and for very old systems.
[[File:FitExponDistr.tif|thumb|260px|Fitted cumulative exponential distribution to annually maximum 1-day rainfalls using [[CumFreq]]<ref>{{cite web |url=http://www.waterlog.info/cumfreq.htm| title=Cumfreq, a free computer program for cumulative frequency analysis}}</ref>]]
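
The two properties used in reliability modeling, memorylessness and the additivity of failure rates, can be checked with a short Python sketch. The rates and times below are hypothetical, and the second check assumes independent components arranged in series, so that the system fails as soon as any one component fails.

```python
import math

def survival(rate, t):
    """P(T > t) for an exponential lifetime with constant hazard `rate`."""
    return math.exp(-rate * t)

# Memorylessness: P(T > s + t | T > s) equals P(T > t)
rate, s, t = 0.2, 3.0, 5.0      # illustrative values (assumptions)
conditional = survival(rate, s + t) / survival(rate, s)
print(conditional, survival(rate, t))

# Series system: it fails when the first component fails, so the
# component survival probabilities multiply and the failure rates add.
component_rates = [0.01, 0.02, 0.05]   # hypothetical failures per hour
system_rate = sum(component_rates)
product = math.prod(survival(r, t) for r in component_rates)
print(product, survival(system_rate, t))
```

Each printed pair agrees up to rounding: having survived to time ''s'' does not change the remaining lifetime distribution, and the series system behaves like a single exponential unit whose rate is the sum of the component rates.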

In [[physics]], for a [[gas]] at a fixed [[temperature]] and [[pressure]] in a uniform [[gravitational field]], the heights of the various molecules also follow an approximate exponential distribution, known as the [[Barometric formula]]. This is a consequence of the entropy property mentioned below.
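
As a sketch of this connection, consider the idealised case of an isothermal gas of molecules of mass ''m'' in a uniform field of strength ''g''. The symbols ''m'', ''g'', ''k''<sub>B</sub> (the Boltzmann constant) and ''T'' are introduced here for illustration. Under these assumptions the height ''h'' of a molecule has the density

<math display="block">f(h) = \frac{m g}{k_\mathrm{B} T} \exp\left( - \frac{m g h}{k_\mathrm{B} T} \right), \qquad h \ge 0,</math>

that is, an exponential distribution with rate <math>\lambda = m g / (k_\mathrm{B} T)</math> and mean height <math>k_\mathrm{B} T / (m g)</math>.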

In [[hydrology]], the exponential distribution is used to analyze extreme values of such variables as monthly and annual maximum values of daily rainfall and river discharge volumes.<ref>{{cite book|editor-last=Ritzema|editor-first=H.P.|title=Frequency and Regression Analysis|year=1994|publisher=Chapter 6 in: Drainage Principles and Applications, Publication 16, International Institute for Land Reclamation and Improvement (ILRI), Wageningen, The Netherlands|pages=[https://archive.org/details/drainageprincipl0000unse/page/175 175–224]|url=https://archive.org/details/drainageprincipl0000unse/page/175| isbn=90-70754-33-9}}</ref>

:The blue picture illustrates an example of fitting the exponential distribution to ranked annually maximum one-day rainfalls showing also the 90% [[confidence belt]] based on the [[binomial distribution]]. The rainfall data are represented by [[plotting position]]s as part of the [[cumulative frequency analysis]].
In operating-room management, the exponential distribution is used to model the distribution of surgery duration for a category of surgeries with [[Predictive methods for surgery duration|no typical work-content]] (as in an emergency room, which encompasses all types of surgeries).

===Prediction===
Given a sample of ''n'' data points observed from an unknown exponential distribution, a common task is to use these samples to make predictions about future data from the same source. A common predictive distribution over future samples is the so-called plug-in distribution, formed by plugging a suitable estimate for the rate parameter ''λ'' into the exponential density function. A common choice of estimate is the one provided by the principle of maximum likelihood, and using this yields the predictive density over a future sample ''x''<sub>''n''+1</sub>, conditioned on the observed samples ''x'' = (''x''<sub>1</sub>, ..., ''x<sub>n</sub>''), given by
<math display="block">p_{\rm ML}(x_{n+1} \mid x_1, \ldots, x_n) = \left( \frac1{\overline{x}} \right) \exp \left( - \frac{x_{n+1}}{\overline{x}} \right).</math>
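
A minimal Python sketch of the plug-in density follows. The function name <code>ml_plugin_density</code> and the sample values are made up for illustration; the only ingredient taken from the formula is the maximum likelihood estimate of the rate, which is the reciprocal of the sample mean.

```python
import math

def ml_plugin_density(x_new, sample):
    """Plug-in predictive density with the ML estimate 1 / mean(sample)."""
    mean = sum(sample) / len(sample)
    rate_hat = 1.0 / mean                   # maximum likelihood estimate of λ
    return rate_hat * math.exp(-rate_hat * x_new)

sample = [0.8, 2.1, 0.3, 1.6, 1.2]          # made-up observations, mean = 1.2
print(ml_plugin_density(1.0, sample))
```

The sketch treats the estimated rate as if it were the true rate, which is exactly the simplification that the alternatives discussed below try to avoid.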

The Bayesian approach provides a predictive distribution which takes into account the uncertainty of the estimated parameter, although this may depend crucially on the choice of prior.

A predictive distribution free of the issues of choosing priors that arise under the subjective Bayesian approach is

<math display="block">p_{\rm CNML}(x_{n+1} \mid x_1, \ldots, x_n) = \frac{ n^{n+1} \left( \overline{x} \right)^n }{ \left( n \overline{x} + x_{n+1} \right)^{n+1} },</math>

which can be considered as
# a frequentist [[confidence distribution]], obtained from the distribution of the pivotal quantity <math>{x_{n+1}}/{\overline{x}}</math>;<ref>{{cite journal |last1=Lawless |first1=J. F. |last2=Fredette |first2=M. |title=Frequentist predictions intervals and predictive distributions |journal=Biometrika |year=2005 |volume=92 |issue=3 |pages=529–542 |doi=10.1093/biomet/92.3.529 |doi-access= }}</ref>
# a profile predictive likelihood, obtained by eliminating the parameter ''λ'' from the joint likelihood of ''x''<sub>''n''+1</sub> and ''λ'' by maximization;<ref>{{cite journal | last1 = Bjornstad | first1 = J.F. | year = 1990 | title = Predictive Likelihood: A Review | journal = Statist. Sci. | volume = 5 | issue = 2| pages = 242–254 | doi=10.1214/ss/1177012175| doi-access = free }}</ref>
# an objective Bayesian predictive posterior distribution, obtained using the non-informative [[Jeffreys prior]] 1/''λ'';
# the Conditional Normalized Maximum Likelihood (CNML) predictive distribution, from information theoretic considerations.<ref>D. F. Schmidt and E. Makalic, "[http://www.emakalic.org/blog/wp-content/uploads/2010/04/SchmidtMakalic09b.pdf Universal Models for the Exponential Distribution]", ''[[IEEE Transactions on Information Theory]]'', Volume 55, Number 7, pp. 3087–3090, 2009 {{doi|10.1109/TIT.2009.2018331}}</ref>
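
The CNML predictive density can be evaluated with a short Python sketch. The function names and the sample are hypothetical. The closed-form cumulative distribution function used as a check, 1 − (''n''<span style="text-decoration:overline">''x''</span> / (''n''<span style="text-decoration:overline">''x''</span> + ''x''<sub>''n''+1</sub>))<sup>''n''</sup>, follows from integrating the density directly; it is worked out here and is not quoted from the cited sources.

```python
import math

def cnml_density(x_new, sample):
    """CNML predictive density n^(n+1) * mean^n / (n*mean + x_new)^(n+1)."""
    n = len(sample)
    total = sum(sample)                      # equals n * mean
    # computed in log space to avoid overflow for large n
    log_p = math.log(n) + n * math.log(total) - (n + 1) * math.log(total + x_new)
    return math.exp(log_p)

def cnml_cdf(x_new, sample):
    """Closed-form integral of the density: 1 - (total / (total + x))^n."""
    n, total = len(sample), sum(sample)
    return 1.0 - (total / (total + x_new)) ** n

sample = [0.8, 2.1, 0.3, 1.6, 1.2]           # made-up observations

# Midpoint-rule integral of the density over [0, 3] as a sanity check
steps, upper = 10_000, 3.0
h = upper / steps
integral = sum(cnml_density((i + 0.5) * h, sample) for i in range(steps)) * h
print(integral, cnml_cdf(upper, sample))
```

The numerical integral and the closed form should agree closely. The density decays polynomially rather than exponentially in ''x''<sub>''n''+1</sub>, which reflects the extra uncertainty about ''λ''.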

The accuracy of a predictive distribution may be measured using the distance or divergence between the true exponential distribution with rate parameter ''λ''<sub>0</sub> and the predictive distribution based on the sample ''x''. The [[Kullback–Leibler divergence]] is a commonly used, parameterisation-free measure of the difference between two distributions. Letting Δ(''λ''<sub>0</sub>||''p'') denote the Kullback–Leibler divergence between an exponential with rate parameter ''λ''<sub>0</sub> and a predictive distribution ''p'', it can be shown that

<math display="block">\begin{align}
\operatorname{E}_{\lambda_0} \left[ \Delta(\lambda_0\parallel p_{\rm ML}) \right] &= \psi(n) + \frac{1}{n-1} - \log(n) \\
\operatorname{E}_{\lambda_0} \left[ \Delta(\lambda_0\parallel p_{\rm CNML}) \right] &= \psi(n) + \frac{1}{n} - \log(n)
\end{align}</math>

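where the expectation is taken with respect to the exponential distribution with rate parameter {{nowrap|''λ''<sub>0</sub> ∈ (0, ∞)}}, and {{nowrap|ψ( · )}} is the digamma function. The two expressions differ by <math>\tfrac{1}{n-1} - \tfrac{1}{n} = \tfrac{1}{n(n-1)}</math>, which is positive for {{nowrap|''n'' > 1}}, and the maximum likelihood expression diverges at <math>n = 1</math>. The CNML predictive distribution is therefore strictly superior to the maximum likelihood plug-in distribution in terms of average Kullback–Leibler divergence for all sample sizes {{nowrap|''n'' > 0}}.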
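
The two expected divergences can be tabulated with a small Python sketch. It relies only on the standard library, so the digamma function is evaluated for positive integers through the identity ψ(''n'') = −''γ'' + Σ<sub>''k''=1</sub><sup>''n''−1</sup> 1/''k'', where ''γ'' is the Euler–Mascheroni constant. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """ψ(n) for a positive integer n: -γ + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))

def expected_kl_ml(n):
    """Expected KL divergence of the ML plug-in predictor (needs n >= 2)."""
    return digamma_int(n) + 1.0 / (n - 1) - math.log(n)

def expected_kl_cnml(n):
    """Expected KL divergence of the CNML predictor."""
    return digamma_int(n) + 1.0 / n - math.log(n)

for n in (2, 5, 10, 100):
    ml, cnml = expected_kl_ml(n), expected_kl_cnml(n)
    # the gap equals 1 / (n * (n - 1)) and shrinks as n grows
    print(n, round(ml, 5), round(cnml, 5), round(ml - cnml, 5))
```

The gap between the two predictors is largest for small samples and vanishes as ''n'' grows, so the choice matters most when little data is available.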