{{Short description|Model for generating observable data in probability and statistics}}
{{About|generative models in the context of statistical classification|generative models of [[Markov decision processes]]|Markov decision process#Simulator models|Generative Modelling Language (GML) in computer graphics and generative computer programming|Generative Modelling Language|Generative Artificial Intelligence (Generative A.I.) models/systems|Generative artificial intelligence}}{{Unfocused|reason=the article covers both generative and discriminative. It should focus on the former, since there is already a dedicated page for [[discriminative model]]|date=May 2025}}
In [[statistical classification]], the two main approaches are the '''generative''' approach and the '''discriminative''' approach. They compute [[classification rule|classifiers]] in different ways and differ in the degree of [[statistical model]]ling involved. Terminology is inconsistent,{{efn|Three leading sources, {{harvnb|Ng|Jordan|2002}}, {{harvnb|Jebara|2004}}, and {{harvnb|Mitchell|2015}}, give different divisions and definitions.}} but three major types can be distinguished:<ref name="jebara_2004">{{cite book |first=Tony |last=Jebara |title=Machine Learning: Discriminative and Generative |series=The Springer International Series in Engineering and Computer Science |publisher=Kluwer Academic (Springer) |year=2004 |isbn=978-1-4020-7647-3 |url=https://www.springer.com/us/book/9781402076473 }}</ref>
# A generative model is a [[statistical model]] of the [[joint probability distribution]] <math>P(X, Y)</math> on a given [[observable variable]] ''X'' and [[target variable]] ''Y''.<ref name="ngjordan2002generative">{{harvtxt|Ng|Jordan|2002}}: "Generative classifiers learn a model of the joint probability, <math>p(x, y)</math>, of the inputs ''x'' and the label ''y'', and make their predictions by using Bayes rules to calculate <math>p(y\mid x)</math>, and then picking the most likely label ''y''."</ref> A generative model can be used to "generate" random instances ([[outcome (probability)|outcomes]]) of an observation ''x''.<ref name="mitchell2015generative" />
# A [[discriminative model]] is a model of the [[conditional probability]] <math>P(Y\mid X = x)</math> of the target ''Y'', given an observation ''x''. It can be used to "discriminate" the value of the target variable ''Y'', given an observation ''x''.<ref name="mitchell2015discriminative" />
# Classifiers computed without using a probability model are also referred to loosely as "discriminative".
The distinction between these last two classes is not consistently made.<ref>{{harvnb|Jebara|2004|loc=2.4 Discriminative Learning}}: "This distinction between conditional learning and discriminative learning is not currently a well-established convention in the field."</ref> {{harvtxt|Jebara|2004}} refers to the three classes as ''generative learning'', ''conditional learning'', and ''discriminative learning'', whereas {{harvtxt|Ng|Jordan|2002}} distinguish only two classes, generative classifiers (joint distribution) and discriminative classifiers (conditional distribution or no distribution), without separating the latter two.<ref>{{harvnb|Ng|Jordan|2002}}: "Discriminative classifiers model the posterior <math>p(y|x)</math> directly, or learn a direct map from inputs ''x'' to the class labels."</ref> Analogously, a classifier based on a generative model is a generative classifier, while a classifier based on a discriminative model is a discriminative classifier, though the latter term also refers to classifiers that are not based on a model.
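
The first two definitions can be illustrated with a toy joint distribution. The following is a minimal sketch rather than a method taken from the cited sources: the table <code>JOINT</code>, its probabilities, and the function names are hypothetical and chosen only for illustration. It shows the generative use (sampling pairs from <math>P(X, Y)</math>) and how the conditional <math>P(Y\mid X = x)</math> can be derived from the same table.

```python
import random

# Hypothetical joint distribution P(X, Y) over weather X and activity Y.
# The probabilities are invented for illustration and sum to 1.
JOINT = {
    ("sunny", "walk"): 0.40,
    ("sunny", "read"): 0.10,
    ("rainy", "walk"): 0.10,
    ("rainy", "read"): 0.40,
}

def sample_joint(rng):
    """Generative use: draw one (x, y) pair from P(X, Y)."""
    pairs = list(JOINT)
    weights = [JOINT[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]

def conditional(x):
    """Derive P(Y | X = x) from the joint by dividing by the marginal P(X = x)."""
    marginal = sum(p for (xi, _), p in JOINT.items() if xi == x)
    return {y: p / marginal for (xi, y), p in JOINT.items() if xi == x}

def classify(x):
    """Pick the most probable label under P(Y | X = x)."""
    posterior = conditional(x)
    return max(posterior, key=posterior.get)

rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
print([sample_joint(rng) for _ in range(3)])
print(conditional("sunny"))
print(classify("rainy"))
```

A purely discriminative model would store only the output of <code>conditional</code> and could not produce samples of ''x''.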

Standard examples of each, all of which are [[linear classifier]]s, are:

* generative classifiers:
** [[naive Bayes classifier]]
** [[linear discriminant analysis]]
* discriminative model:
** [[logistic regression]]
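
As a rough illustration of two of the examples listed above, the sketch below fits a Bernoulli naive Bayes classifier and a logistic regression to the same toy data using only the Python standard library. The data set, the Laplace smoothing, the learning rate, and the number of gradient steps are all assumptions made for this example, not choices prescribed by the cited sources.

```python
import math

# Hypothetical toy data: two binary features per example, binary label.
X = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0), (0, 1), (1, 0)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

def fit_naive_bayes(X, y):
    """Generative: estimate P(Y) and P(X_j | Y), which together define P(X, Y)."""
    params = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # Laplace-smoothed estimate of P(X_j = 1 | Y = c) for each feature j.
        theta = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(len(X[0]))]
        params[c] = (prior, theta)
    return params

def nb_posterior(params, x):
    """P(Y = 1 | X = x), obtained from the modelled joint via Bayes' rule."""
    joint = {}
    for c, (prior, theta) in params.items():
        likelihood = math.prod(t if xj else 1 - t for t, xj in zip(theta, x))
        joint[c] = prior * likelihood
    return joint[1] / (joint[0] + joint[1])

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Discriminative: fit P(Y = 1 | X = x) directly by gradient descent on log-loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        # Prediction errors are computed before any parameter is updated.
        errors = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - label
                  for x, label in zip(X, y)]
        for j in range(len(w)):
            w[j] -= lr * sum(e * x[j] for e, x in zip(errors, X)) / len(X)
        b -= lr * sum(errors) / len(X)
    return w, b

def lr_posterior(model, x):
    """P(Y = 1 | X = x) under the fitted logistic regression."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

nb = fit_naive_bayes(X, y)
logreg = fit_logistic(X, y)
for x in [(1, 1), (0, 1)]:
    print(x, round(nb_posterior(nb, x), 3), round(lr_posterior(logreg, x), 3))
```

Both functions return an estimate of <math>P(Y = 1\mid X = x)</math>, but only the naive Bayes parameters also describe how ''x'' itself is distributed.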

In classification, one wishes to go from an observation ''x'' to a label ''y'' (or a probability distribution on labels). One can compute this directly, without using a probability distribution (''distribution-free classifier''); one can estimate the probability of a label given an observation, <math>P(Y\mid X=x)</math> (''discriminative model''), and base classification on that; or one can estimate the joint distribution <math>P(X, Y)</math> (''generative model''), compute from it the conditional probability <math>P(Y\mid X=x)</math>, and then base classification on that. These approaches are increasingly indirect but increasingly probabilistic, allowing more [[domain knowledge]] and probability theory to be applied. In practice, different approaches are used depending on the particular problem, and hybrids can combine the strengths of multiple approaches.
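
The step from the joint distribution to the conditional probability in the generative route can be written out explicitly. Assuming, for simplicity, a discrete target ''Y'' (for a continuous target the sum becomes an integral), the definition of conditional probability gives

<math display="block">P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)} = \frac{P(X = x, Y = y)}{\sum_{y'} P(X = x, Y = y')},</math>

and a classifier can then select the label with the largest conditional probability:

<math display="block">\hat{y}(x) = \arg\max_{y} P(Y = y \mid X = x) = \arg\max_{y} P(X = x, Y = y).</math>

The second equality holds because the denominator <math>P(X = x)</math> does not depend on ''y''.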