===General mixture model===
A typical finite-dimensional mixture model is a [[hierarchical Bayes model|hierarchical model]] consisting of the following components:
*''N'' random variables that are observed, each distributed according to a mixture of ''K'' components, with the components belonging to the same [[parametric family]] of distributions (e.g., all [[normal distribution|normal]], all [[Zipf's law|Zipfian]], etc.) but with different parameters
*''N'' random [[latent variable]]s specifying the identity of the mixture component of each observation, each distributed according to a ''K''-dimensional [[categorical distribution]]
*A set of ''K'' mixture weights, which are probabilities that sum to 1
*A set of ''K'' parameters, each specifying the parameter of the corresponding mixture component. In many cases, each "parameter" is actually a set of parameters. For example, if the mixture components are [[Gaussian distribution]]s, there will be a [[mean]] and [[variance]] for each component. If the mixture components are [[categorical distribution]]s (e.g., when each observation is a token from a finite alphabet of size ''V''), there will be a vector of ''V'' probabilities summing to 1.

In addition, in a [[Bayesian inference|Bayesian setting]], the mixture weights and parameters will themselves be random variables, and [[prior distribution]]s will be placed over the variables. In such a case, the weights are typically viewed as a ''K''-dimensional random vector drawn from a [[Dirichlet distribution]] (the [[conjugate prior]] of the categorical distribution), and the parameters will be distributed according to their respective conjugate priors.
Mathematically, a basic parametric mixture model can be described as follows:

:<math>
\begin{array}{lcl}
K &=& \text{number of mixture components} \\
N &=& \text{number of observations} \\
\theta_{i=1 \dots K} &=& \text{parameter of distribution of observation associated with component } i \\
\phi_{i=1 \dots K} &=& \text{mixture weight, i.e., prior probability of a particular component } i \\
\boldsymbol\phi &=& K\text{-dimensional vector composed of all the individual } \phi_{1 \dots K} \text{; must sum to 1} \\
z_{i=1 \dots N} &=& \text{component of observation } i \\
x_{i=1 \dots N} &=& \text{observation } i \\
F(x|\theta) &=& \text{probability distribution of an observation, parametrized on } \theta \\
z_{i=1 \dots N} &\sim& \operatorname{Categorical}(\boldsymbol\phi) \\
x_{i=1 \dots N}|z_{i=1 \dots N} &\sim& F(\theta_{z_i})
\end{array}
</math>

In a Bayesian setting, all parameters are associated with random variables, as follows:

:<math>
\begin{array}{lcl}
K,N &=& \text{as above} \\
\theta_{i=1 \dots K}, \phi_{i=1 \dots K}, \boldsymbol\phi &=& \text{as above} \\
z_{i=1 \dots N}, x_{i=1 \dots N}, F(x|\theta) &=& \text{as above} \\
\alpha &=& \text{shared hyperparameter for component parameters} \\
\beta &=& \text{shared hyperparameter for mixture weights} \\
H(\theta|\alpha) &=& \text{prior probability distribution of component parameters, parametrized on } \alpha \\
\theta_{i=1 \dots K} &\sim& H(\theta|\alpha) \\
\boldsymbol\phi &\sim& \operatorname{Symmetric-Dirichlet}_K(\beta) \\
z_{i=1 \dots N}|\boldsymbol\phi &\sim& \operatorname{Categorical}(\boldsymbol\phi) \\
x_{i=1 \dots N}|z_{i=1 \dots N},\theta_{i=1 \dots K} &\sim& F(\theta_{z_i})
\end{array}
</math>

This characterization uses ''F'' and ''H'' to describe arbitrary distributions over observations and parameters, respectively. Typically ''H'' will be the [[conjugate prior]] of ''F''.
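The Bayesian generative process above can be sketched in a few lines of Python. For illustration only, this sketch assumes ''F'' is a normal distribution with unit variance and ''H'' is a normal prior on the component means; the hyperparameter values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

K, N = 3, 10          # number of mixture components and observations
alpha = (0.0, 10.0)   # assumed hyperparameters of H: mean and std of a normal prior
beta = 1.0            # symmetric Dirichlet concentration for the mixture weights

# theta_i ~ H(theta | alpha): here H is taken to be normal (an assumption)
theta = rng.normal(alpha[0], alpha[1], size=K)

# phi ~ Symmetric-Dirichlet_K(beta): K-dimensional weight vector summing to 1
phi = rng.dirichlet(np.full(K, beta))

# z_i | phi ~ Categorical(phi): latent component of each observation
z = rng.choice(K, size=N, p=phi)

# x_i | z_i, theta ~ F(theta_{z_i}): here F is normal with unit variance (an assumption)
x = rng.normal(theta[z], 1.0)
```

Each observation `x[i]` is generated by first drawing its component label `z[i]` and then sampling from that component's distribution, which is exactly the two-stage sampling the formulas describe.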
The two most common choices of ''F'' are [[Gaussian distribution|Gaussian]] aka "[[normal distribution|normal]]" (for real-valued observations) and [[categorical distribution|categorical]] (for discrete observations). Other common possibilities for the distribution of the mixture components are:
*[[Binomial distribution]], for the number of "positive occurrences" (e.g., successes, yes votes, etc.) given a fixed number of total occurrences
*[[Multinomial distribution]], similar to the binomial distribution, but for counts of multi-way occurrences (e.g., yes/no/maybe in a survey)
*[[Negative binomial distribution]], for binomial-type observations but where the quantity of interest is the number of failures before a given number of successes occurs
*[[Poisson distribution]], for the number of occurrences of an event in a given period of time, for an event that is characterized by a fixed rate of occurrence
*[[Exponential distribution]], for the time before the next event occurs, for an event that is characterized by a fixed rate of occurrence
*[[Log-normal distribution]], for positive real numbers that are assumed to grow exponentially, such as incomes or prices
*[[Multivariate normal distribution]] (aka multivariate Gaussian distribution), for vectors of correlated outcomes that are individually Gaussian-distributed
*[[multivariate t-distribution|Multivariate Student's ''t''-distribution]], for vectors of heavy-tailed correlated outcomes<ref>{{cite journal |first1=Sotirios P. |last1=Chatzis |first2=Dimitrios I. |last2=Kosmopoulos |first3=Theodora A. |last3=Varvarigou |title=Signal Modeling and Classification Using a Robust Latent Space Model Based on t Distributions |journal=IEEE Transactions on Signal Processing |volume=56 |issue=3 |pages=949–963 |year=2008 |doi=10.1109/TSP.2007.907912 |bibcode=2008ITSP...56..949C |s2cid=15583243 }}</ref>
*A vector of [[Bernoulli distribution|Bernoulli]]-distributed values, corresponding, e.g., to a black-and-white image, with each value representing a pixel; see the handwriting-recognition example below
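As a sketch of the last case, a mixture of Bernoulli vectors (e.g., tiny black-and-white images, one Bernoulli value per pixel) can be sampled as follows; the sizes, weights, and pixel probabilities are arbitrary values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

K, D, N = 2, 16, 5                     # components, pixels per image, number of images
phi = np.array([0.4, 0.6])             # mixture weights; must sum to 1
theta = rng.uniform(0.1, 0.9, (K, D))  # per-component probability that each pixel is "on"

z = rng.choice(K, size=N, p=phi)       # latent component of each image
images = rng.binomial(1, theta[z])     # each pixel drawn as Bernoulli(theta[z_i, d])
```

Here each mixture component is parametrized by a vector of ''D'' pixel probabilities rather than a single scalar, illustrating the earlier point that a component "parameter" is often itself a set of parameters.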