{{Short description|Probabilistic classification algorithm}} [[Image:Naive corral.png|thumb|Example of a naive Bayes classifier depicted as a Bayesian Network]] In [[statistics]], '''naive''' (sometimes '''simple''' or '''idiot's''') '''Bayes classifiers''' are a family of "[[Probabilistic classification|probabilistic classifier]]s" which assume that the features are conditionally independent, given the target class.<ref name="idiots">{{cite journal |last1=Hand |first1=D. J. |last2=Yu |first2=K. |year=2001 |title=Idiot's Bayes – not so stupid after all? |journal=International Statistical Review |volume=69 |issue=3 |pages=385–399 |issn=0306-7734 |doi=10.2307/1403452 |jstor=1403452}}</ref> In other words, a naive Bayes model assumes that the information about the class provided by each variable is unrelated to the information from the others, with no information shared between the predictors. The highly unrealistic nature of this assumption, called the '''naive independence assumption''', is what gives the classifier its name. These classifiers are among the simplest [[Bayesian network]] models.<ref>{{cite web |last1=McCallum |first1=Andrew |title=Graphical Models, Lecture2: Bayesian Network Representation |url=https://people.cs.umass.edu/~mccallum/courses/gm2011/02-bn-rep.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://people.cs.umass.edu/~mccallum/courses/gm2011/02-bn-rep.pdf |archive-date=2022-10-09 |url-status=live |access-date=22 October 2019}}</ref>

Naive Bayes classifiers generally perform worse than more advanced models such as [[Logistic regression|logistic regression]], especially at [[Uncertainty quantification|quantifying uncertainty]] (naive Bayes models often produce wildly overconfident probabilities). However, they are highly scalable, requiring only one parameter for each feature or predictor in a learning problem. [[Maximum-likelihood estimation|Maximum-likelihood]] training can be done by evaluating a [[closed-form expression]] (simply by counting observations in each group),<ref name="aima"/>{{rp|p=718}} rather than by the expensive [[Iterative method|iterative approximation]] algorithms required by most other models. Despite the use of [[Bayes' theorem]] in the classifier's decision rule, naive Bayes is not (necessarily) a [[Bayesian probability|Bayesian]] method, and naive Bayes models can be fit to data using either [[Bayesian inference|Bayesian]] or [[frequentist]] methods.<ref name="idiots" /><ref name="aima">{{cite AIMA|edition=2}}</ref>
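The closed-form, counting-based training described above can be illustrated with a minimal sketch. The following Python example is illustrative only, not taken from the cited sources: the function names and the toy weather data are invented for this sketch, and [[Additive smoothing|Laplace smoothing]] is assumed so that unseen feature values do not zero out the product of probabilities.

<syntaxhighlight lang="python">
from collections import Counter, defaultdict

# Categorical naive Bayes: parameters are estimated in closed form
# simply by counting how often each feature value co-occurs with each class.
def train(examples):
    """examples: list of (feature_dict, class_label) pairs."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)  # (feature, class) -> value counts
    for features, label in examples:
        class_counts[label] += 1
        for name, value in features.items():
            feature_counts[(name, label)][value] += 1
    return class_counts, feature_counts

def predict(class_counts, feature_counts, features):
    """Score each class by P(class) * prod_i P(feature_i | class)."""
    total = sum(class_counts.values())
    best_label, best_score = None, 0.0
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for name, value in features.items():
            seen = feature_counts[(name, label)]
            # Laplace smoothing: add one pseudo-count per value so that
            # a feature value never observed with this class scores > 0.
            score *= (seen[value] + 1) / (count + len(seen) + 1)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy usage: classify invented weather observations.
data = [({"outlook": "sunny", "windy": "no"},  "play"),
        ({"outlook": "rainy", "windy": "yes"}, "stay"),
        ({"outlook": "sunny", "windy": "yes"}, "play"),
        ({"outlook": "rainy", "windy": "no"},  "stay")]
cc, fc = train(data)
print(predict(cc, fc, {"outlook": "sunny", "windy": "no"}))  # -> "play"
</syntaxhighlight>

Because the independence assumption lets the class score factorize into a prior multiplied by per-feature conditional probabilities, training reduces to a single counting pass over the data, which is the source of the scalability noted above.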