Cluster analysis
==== [[Fowlkes–Mallows Index|Fowlkes–Mallows index]] ====
The Fowlkes–Mallows index<ref>{{cite journal | last1 = Fowlkes | first1 = E. B. | last2 = Mallows | first2 = C. L. | year = 1983 | title = A Method for Comparing Two Hierarchical Clusterings | jstor = 2288117 | journal = Journal of the American Statistical Association | volume = 78 | issue = 383 | pages = 553–569 | doi = 10.1080/01621459.1983.10478008 }}</ref> measures the similarity between the clusters returned by a clustering algorithm and the benchmark classifications. The higher its value, the more similar the clusters and the benchmark classifications are. It can be computed using the following formula:
:<math> FM = \sqrt{ \frac {TP}{TP+FP} \cdot \frac{TP}{TP+FN} } </math>
where <math>TP</math> is the number of [[true positive]]s, <math>FP</math> is the number of [[false positives]], and <math>FN</math> is the number of [[false negatives]]. The <math>FM</math> index is the geometric mean of the [[precision (information retrieval)|precision]] <math>P</math> and [[recall (information retrieval)|recall]] <math>R</math>, and is thus also known as the [[G-measure]], while the F-measure is their harmonic mean.<ref name="powers">{{cite conference | last = Powers | first = David | date = 2003 | conference = International Conference on Cognitive Science | pages = 529–534 | title = Recall and Precision versus the Bookmaker }}</ref><ref>{{cite journal | last1 = Arabie | first1 = P. | year = 1985 | title = Comparing partitions | journal = Journal of Classification | volume = 2 | issue = 1 | pages = 193–218 | doi = 10.1007/BF01908075 | s2cid = 189915041 }}</ref> Moreover, [[precision (information retrieval)|precision]] and [[recall (information retrieval)|recall]] are also known as Wallace's indices <math>B^I</math> and <math>B^{II}</math>.<ref>{{cite journal | last1 = Wallace | first1 = D. L. | year = 1983 | title = Comment | journal = Journal of the American Statistical Association | volume = 78 | issue = 383 | pages = 569–579 | doi = 10.1080/01621459.1983.10478009 }}</ref> Chance-normalized versions of recall, precision and G-measure correspond to [[Informedness]], [[Markedness]] and [[Matthews correlation coefficient|Matthews Correlation]], and relate strongly to [[Cohen's kappa|Kappa]].<ref name="kappa">{{cite conference | last1 = Powers | first1 = David | title = The Problem with Kappa | conference = European Chapter of the Association for Computational Linguistics | date = 2012 | pages = 345–355 }}</ref>
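
The formula can be illustrated with a minimal Python sketch. It assumes the usual pair-counting convention for clustering comparison: <math>TP</math>, <math>FP</math> and <math>FN</math> count pairs of points, where a pair is "positive" if the clustering puts both points in the same cluster and "true" if the benchmark agrees. The function name <code>fowlkes_mallows</code> and the example labels are hypothetical and chosen only for illustration; returning 0 when there are no true positive pairs is likewise an assumption of this sketch.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows index computed from pair counts (illustrative sketch)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same points")

    tp = fp = fn = 0
    # Examine every unordered pair of points exactly once.
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_pred and same_true:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped by the clustering only
        elif same_true:
            fn += 1  # grouped by the benchmark only

    if tp == 0:
        # Assumption of this sketch: report 0 when no pair is a true positive,
        # which also avoids dividing by zero.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return sqrt(precision * recall)  # geometric mean of precision and recall


benchmark = [0, 0, 0, 1, 1, 1]
clusters = [0, 0, 1, 1, 2, 2]
fm = fowlkes_mallows(benchmark, clusters)
```

For these example labels the pair counts are <math>TP = 2</math>, <math>FP = 1</math> and <math>FN = 4</math>, so <math>FM = \sqrt{\tfrac{2}{3} \cdot \tfrac{2}{6}} \approx 0.471</math>. The quadratic loop over pairs keeps the sketch close to the definition; for large data sets the same counts are usually derived from a contingency table instead.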