===Semi-supervised parameter estimation===
Given a way to train a naive Bayes classifier from labeled data, it is possible to construct a [[semi-supervised learning|semi-supervised]] training algorithm that learns from a combination of labeled and unlabeled data by running the supervised learning algorithm in a loop:<ref name="em"/>

#Given a collection <math>D = L \uplus U</math> of labeled samples {{mvar|L}} and unlabeled samples {{mvar|U}}, start by training a naive Bayes classifier on {{mvar|L}}.
#Until convergence, do:
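##Predict class probabilities <math>P(C \mid x)</math> for all unlabeled examples {{mvar|x}} in <math>U</math>; the examples in {{mvar|L}} keep their known labels.
##Re-train the model on all of <math>D</math>, using the known labels for {{mvar|L}} and, for {{mvar|U}}, the ''probabilities'' (not hard labels) predicted in the previous step.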

Convergence is declared when an iteration no longer improves the model likelihood <math>P(D \mid \theta)</math> by more than a chosen tolerance, where <math>\theta</math> denotes the parameters of the naive Bayes model.
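
The loop above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation. It assumes a multinomial naive Bayes model over count features with additive smoothing, and it assumes that labeled examples keep their known labels while only the unlabeled examples receive predicted probabilities. The names <code>fit_nb</code>, <code>joint_log_prob</code>, <code>semi_supervised_nb</code>, <code>alpha</code>, <code>tol</code> and <code>max_iter</code> are hypothetical and chosen only for this example.

```python
import numpy as np


def fit_nb(X, R, alpha=1.0):
    """Fit multinomial naive Bayes from soft class memberships.

    X: (n, d) matrix of feature counts.
    R: (n, k) class memberships, one row per example, each row summing to 1.
    alpha: additive smoothing constant (an assumption of this sketch).
    Returns log class priors (k,) and log feature probabilities (k, d).
    """
    class_weight = R.sum(axis=0) + alpha               # smoothed soft class counts
    log_prior = np.log(class_weight / class_weight.sum())
    feature_weight = R.T @ X + alpha                   # smoothed soft feature counts
    log_lik = np.log(feature_weight / feature_weight.sum(axis=1, keepdims=True))
    return log_prior, log_lik


def joint_log_prob(X, log_prior, log_lik):
    """Return log P(c) + log P(x | c) for every example and class, shape (n, k).

    The multinomial coefficient is dropped because it does not depend on the
    parameters.
    """
    return X @ log_lik.T + log_prior


def semi_supervised_nb(X_l, y_l, X_u, n_classes, alpha=1.0, tol=1e-6, max_iter=100):
    """Train on labeled (X_l, y_l) and unlabeled X_u data by iterating E and M steps."""
    R_l = np.eye(n_classes)[y_l]                       # fixed one-hot labels for L
    log_prior, log_lik = fit_nb(X_l, R_l, alpha)       # initial model from L only
    X_all = np.vstack([X_l, X_u])
    history = []                                       # log-likelihood per iteration
    prev = -np.inf
    for _ in range(max_iter):
        # Predict class probabilities for the unlabeled examples.
        joint_u = joint_log_prob(X_u, log_prior, log_lik)
        peak = joint_u.max(axis=1, keepdims=True)      # for a stable log-sum-exp
        log_norm_u = peak[:, 0] + np.log(np.exp(joint_u - peak).sum(axis=1))
        R_u = np.exp(joint_u - log_norm_u[:, None])

        # Log-likelihood of D under the current parameters: labeled examples
        # contribute their own class term, unlabeled ones a sum over classes.
        joint_l = joint_log_prob(X_l, log_prior, log_lik)
        ll = joint_l[np.arange(len(y_l)), y_l].sum() + log_norm_u.sum()
        history.append(ll)
        if ll - prev < tol:                            # no meaningful improvement
            break
        prev = ll

        # Re-train on all of D using the probabilities, not hard labels.
        log_prior, log_lik = fit_nb(X_all, np.vstack([R_l, R_u]), alpha)
    return log_prior, log_lik, R_u, history
```

Calling <code>semi_supervised_nb(X_l, y_l, X_u, n_classes=2)</code> with small count matrices returns the fitted log-parameters, the final class probabilities for the unlabeled rows, and the sequence of log-likelihood values. Because this sketch smooths the estimates, the plain likelihood is not guaranteed to increase at every iteration, so the loop also stops after <code>max_iter</code> iterations.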

This training algorithm is an instance of the more general [[expectation–maximization algorithm]] (EM): the prediction step inside the loop is the ''E''-step of EM, while the re-training of naive Bayes is the ''M''-step. The algorithm is formally justified by the assumption that the data are generated by a [[mixture model]] and that the components of this mixture model are exactly the classes of the classification problem.<ref name="em">{{cite journal |first1=Kamal |last1=Nigam |first2=Andrew |last2=McCallum |first3=Sebastian |last3=Thrun |first4=Tom |last4=Mitchell |title=Learning to classify text from labeled and unlabeled documents using EM |journal=[[Machine Learning (journal)|Machine Learning]] |volume=39 |issue=2/3 |pages=103–134 |year=2000 |url=http://www.kamalnigam.com/papers/emcat-aaai98.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://www.kamalnigam.com/papers/emcat-aaai98.pdf |archive-date=2022-10-09 |url-status=live|doi=10.1023/A:1007692713085 |s2cid=686980 |doi-access=free }}</ref>
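
As an illustration of the mixture-model view, suppose the parameters <math>\theta</math> consist of class priors <math>P(c \mid \theta)</math> and class-conditional distributions <math>P(x \mid c, \theta)</math> (notation assumed here for the example). If each labeled example contributes the term for its known class <math>y</math> and each unlabeled example contributes a sum over all classes, the likelihood can be written as

<math display="block">P(D \mid \theta) = \prod_{(x, y) \in L} P(y \mid \theta)\, P(x \mid y, \theta) \;\prod_{x \in U} \sum_{c} P(c \mid \theta)\, P(x \mid c, \theta).</math>

The ''E''-step then computes, for each unlabeled example, the posterior class probability under the current parameters:

<math display="block">P(c \mid x, \theta) = \frac{P(c \mid \theta)\, P(x \mid c, \theta)}{\sum_{c'} P(c' \mid \theta)\, P(x \mid c', \theta)}.</math>

The ''M''-step re-estimates the parameters from these probabilities. For example, without smoothing, the maximum-likelihood update of a class prior is

<math display="block">P(c \mid \theta^{\text{new}}) = \frac{1}{|D|} \left( \sum_{(x, y) \in L} [y = c] + \sum_{x \in U} P(c \mid x, \theta) \right),</math>

where <math>[y = c]</math> equals 1 if <math>y = c</math> and 0 otherwise. The class-conditional parameters are updated analogously, with each unlabeled example counted fractionally according to its posterior probabilities.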