Binary classification
==Evaluation==
{{main|Evaluation of binary classifiers}}
From tallies of the four basic outcomes, there are many approaches that can be used to measure the accuracy of a classifier or predictor. Different fields have different preferences.

===The eight basic ratios===
A common approach to evaluation is to begin by computing two ratios of a standard pattern. There are eight basic ratios of this form that one can compute from the contingency table, which come in four complementary pairs (each pair summing to 1). These are obtained by dividing each of the four numbers by the sum of its row or column, yielding eight numbers, which can be referred to generically in the form "true positive row ratio" or "false negative column ratio". There are thus two pairs of column ratios and two pairs of row ratios, and one can summarize these with four numbers by choosing one ratio from each pair – the other four numbers are their complements.

The row ratios are:
*[[true positive rate]] (TPR) = TP/(TP+FN), aka '''[[Sensitivity (tests)|sensitivity]]''' or [[Recall (information retrieval)|recall]]. These are the proportion of the ''population with the condition'' for which the test is correct.
**with complement the [[false negative rate]] (FNR) = FN/(TP+FN)
*[[true negative rate]] (TNR) = TN/(TN+FP), aka '''[[Specificity (tests)|specificity]]''' (SPC)
**with complement the [[false positive rate]] (FPR) = FP/(TN+FP)
The row ratios are independent of [[prevalence]].

The column ratios are:
*[[Positive Predictive Value|positive predictive value]] (PPV, aka [[Precision (information retrieval)|precision]]) = TP/(TP+FP). These are the proportion of the ''population with a given test result'' for which the test is correct.
**with complement the [[false discovery rate]] (FDR) = FP/(TP+FP)
*[[negative predictive value]] (NPV) = TN/(TN+FN)
**with complement the [[false omission rate]] (FOR) = FN/(TN+FN)
The column ratios depend on prevalence.
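The eight basic ratios can be computed directly from the four counts. The following is an illustrative sketch, not part of the article; the variable names and the example counts (TP = 90, FN = 10, FP = 30, TN = 870) are arbitrary choices for demonstration:

```python
def basic_ratios(tp, fn, fp, tn):
    """Return the eight basic ratios of a 2x2 contingency table as a dict."""
    return {
        "TPR": tp / (tp + fn),  # true positive rate (sensitivity, recall)
        "FNR": fn / (tp + fn),  # false negative rate, complement of TPR
        "TNR": tn / (tn + fp),  # true negative rate (specificity)
        "FPR": fp / (tn + fp),  # false positive rate, complement of TNR
        "PPV": tp / (tp + fp),  # positive predictive value (precision)
        "FDR": fp / (tp + fp),  # false discovery rate, complement of PPV
        "NPV": tn / (tn + fn),  # negative predictive value
        "FOR": fn / (tn + fn),  # false omission rate, complement of NPV
    }

r = basic_ratios(tp=90, fn=10, fp=30, tn=870)
# Each complementary pair (TPR/FNR, TNR/FPR, PPV/FDR, NPV/FOR) sums to 1,
# up to floating-point rounding.
```

Note that choosing one ratio from each pair (e.g. TPR, TNR, PPV, NPV) fully summarizes the table's normalized behaviour, since the other four values are just the complements.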
In diagnostic testing, the main ratios used are the true row ratios – the true positive rate and true negative rate – where they are known as [[sensitivity and specificity]]. In information retrieval, the main ratios are the two true positive ratios (row and column) – the positive predictive value and true positive rate – where they are known as [[precision and recall]]. Cullerne Bown has suggested a flow chart for determining which pair of indicators should be used when.<ref name="CullerneBown2024">{{Cite journal | author = William Cullerne Bown | title = Sensitivity and Specificity versus Precision and Recall, and Related Dilemmas | journal = [[Journal of Classification]] | year = 2024 | volume = 41 | issue = 2 | pages = 402–426 | doi = 10.1007/s00357-024-09478-y | url = https://rdcu.be/dL1wK | url-access = subscription }}</ref> Otherwise, there is no general rule for deciding. There is also no general agreement on how the pair of indicators should be used to decide on concrete questions, such as when to prefer one classifier over another.

One can take ratios of a complementary pair of ratios, yielding four [[Likelihood ratios in diagnostic testing|likelihood ratios]] (two from the row ratios, two from the column ratios). This is primarily done for the condition ratios, yielding the [[likelihood ratios in diagnostic testing]]. Taking the ratio of a pair of likelihood ratios yields a final ratio, the [[diagnostic odds ratio]] (DOR). This can also be defined directly as (TP×TN)/(FP×FN) = (TP/FN)/(FP/TN); this has a useful interpretation – as an [[odds ratio]] – and is prevalence-independent.

===Other metrics===
There are a number of other metrics, most simply the [[Accuracy and precision#In binary classification|accuracy]] or Fraction Correct (FC), which measures the fraction of all instances that are correctly categorized; the complement is the Fraction Incorrect (FiC).
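The likelihood ratios and diagnostic odds ratio described above can also be sketched numerically. This is an illustrative example with arbitrary counts, not part of the article; it checks the identity DOR = (TP×TN)/(FP×FN):

```python
def likelihood_ratios(tp, fn, fp, tn):
    """Return (LR+, LR-, DOR) for a 2x2 contingency table."""
    tpr, fnr = tp / (tp + fn), fn / (tp + fn)
    tnr, fpr = tn / (tn + fp), fp / (tn + fp)
    lr_pos = tpr / fpr      # positive likelihood ratio, TPR/FPR
    lr_neg = fnr / tnr      # negative likelihood ratio, FNR/TNR
    dor = lr_pos / lr_neg   # diagnostic odds ratio, equals (tp*tn)/(fp*fn)
    return lr_pos, lr_neg, dor

lr_pos, lr_neg, dor = likelihood_ratios(tp=90, fn=10, fp=30, tn=870)
# dor agrees with the direct definition (90 * 870) / (30 * 10) = 261
```

Because both likelihood ratios are built from the prevalence-independent row ratios, the DOR is itself prevalence-independent, as the text notes.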
The [[F-score]] combines precision and recall into one number via a choice of weighting, most simply equal weighting, as the balanced F-score ([[F1 score]]). Some metrics come from [[regression coefficient]]s: the [[markedness]] and the [[informedness]], and their [[geometric mean]], the [[Matthews correlation coefficient]]. Other metrics include [[Youden's J statistic]], the [[uncertainty coefficient]], the [[phi coefficient]], and [[Cohen's kappa]].
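A few of these summary metrics can be computed directly from the four counts. The sketch below is illustrative, with arbitrary example counts; it uses the standard formulas for accuracy, the balanced F-score, and the Matthews correlation coefficient:

```python
import math

def summary_metrics(tp, fn, fp, tn):
    """Return (accuracy, F1, MCC) for a 2x2 contingency table."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total                         # fraction correct
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Balanced F-score: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient, computed from the counts directly.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, f1, mcc

accuracy, f1, mcc = summary_metrics(tp=90, fn=10, fp=30, tn=870)
```

Unlike accuracy and F1, the MCC uses all four cells of the table, which is why it is often preferred when the classes are imbalanced.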