==== [[Fowlkes–Mallows Index|Fowlkes–Mallows index]]====
The Fowlkes–Mallows index<ref>{{cite journal | last1 = Fowlkes | first1 = E. B. | last2 = Mallows | first2 = C. L. | year = 1983 | title = A Method for Comparing Two Hierarchical Clusterings | jstor = 2288117 | journal = Journal of the American Statistical Association | volume = 78 | issue = 383| pages = 553–569 | doi = 10.1080/01621459.1983.10478008 }}</ref> measures the similarity between the clusters returned by a clustering algorithm and the benchmark classifications. The higher its value, the more similar the clusters and the benchmark classifications are. It can be computed using the following formula:
<math>
FM = \sqrt{ \frac {TP}{TP+FP} \cdot \frac{TP}{TP+FN}  }
</math>
where <math>TP</math> is the number of [[true positive]]s, <math>FP</math> is the number of [[false positives]], and <math>FN</math> is the number of [[false negatives]]. The <math>FM</math> index is the geometric mean of the [[precision (information retrieval)|precision]] <math>P</math> and the [[recall (information retrieval)|recall]] <math>R</math>, and is thus also known as the [[G-measure]], while the F-measure is their harmonic mean.<ref name="powers">{{cite conference | last = Powers | first = David | date = 2003 | conference = International Conference on Cognitive Science | pages = 529–534 | title = Recall and Precision versus the Bookmaker }}</ref><ref>{{cite journal | last1 = Arabie | first1 = P. | year = 1985| title = Comparing partitions | journal = Journal of Classification | volume = 2 | issue = 1| pages = 193–218 | doi = 10.1007/BF01908075 | s2cid = 189915041 }}</ref> Moreover, precision and recall are also known as Wallace's indices <math>B^I</math> and <math>B^{II}</math>.<ref>{{cite journal | last1 = Wallace | first1 = D. L. | year = 1983 | title = Comment | journal = Journal of the American Statistical Association | volume = 78 | issue = 383| pages = 569–579 | doi=10.1080/01621459.1983.10478009}}</ref> Chance-normalized versions of recall, precision and G-measure correspond to [[Informedness]], [[Markedness]] and [[Matthews correlation coefficient|Matthews correlation]], and relate strongly to [[Cohen's kappa|Kappa]].<ref name="kappa">{{cite conference | last1 = Powers | first1 = David | title = The Problem with Kappa | conference = European Chapter of the Association for Computational Linguistics | date = 2012 | pages = 345–355}}</ref>
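
The formula can be illustrated with a minimal Python sketch. It assumes the usual pair-counting convention for comparing two flat clusterings: a pair of points is a true positive if both labelings put it in the same cluster, a false positive if only the predicted clustering does, and a false negative if only the benchmark does. The function name <code>fowlkes_mallows</code> and the choice to return 0 when there are no true-positive pairs are assumptions made for this illustration, not part of the cited definition.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Pair-counting Fowlkes-Mallows index of two flat labelings (illustrative sketch)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same points")

    tp = fp = fn = 0
    # Examine every unordered pair of points once (quadratic in the number of points).
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_pred and same_true:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only by the predicted clustering
        elif same_true:
            fn += 1  # grouped together only by the benchmark

    if tp == 0:
        # Assumed convention for this sketch: avoid 0/0 and report no similarity.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return sqrt(precision * recall)


# Benchmark: two clusters of three points; prediction: three clusters of two.
# Here TP = 2, FP = 1, FN = 4, so precision = 2/3 and recall = 1/3.
score = fowlkes_mallows([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2])
```

In this example the index equals <math>\sqrt{2/3 \cdot 1/3} \approx 0.471</math>. Because only co-membership of pairs is compared, renaming the cluster labels does not change the result.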