==ROC space==
[[Image:ROC space-2.png|thumb|right|The ROC space and plots of the four prediction examples.]]
[[Image:roc_curve.svg|thumb|right|The ROC space for a "better" and "worse" classifier.]]

Several evaluation metrics can be derived from the contingency table (see infobox). To draw a ROC curve, only the true positive rate (TPR) and false positive rate (FPR) are needed (as functions of some classifier parameter). The TPR defines how many correct positive results occur among all positive samples available during the test; the FPR defines how many incorrect positive results occur among all negative samples available during the test. In terms of the contingency table,
: <math>\text{TPR} = \frac{\text{TP}}{\text{TP} + \text{FN}}, \qquad \text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}}.</math>

A ROC space is defined by FPR and TPR as ''x'' and ''y'' axes, respectively, and depicts the relative trade-off between true positives (benefits) and false positives (costs). Since TPR is equivalent to sensitivity and FPR is equal to 1 − [[specificity (tests)|specificity]], the ROC graph is sometimes called the sensitivity vs (1 − specificity) plot. Each prediction result or instance of a [[confusion matrix]] represents one point in the ROC space.

The best possible prediction method would yield a point in the upper left corner, at coordinate (0,1) of the ROC space, representing 100% sensitivity (no false negatives) and 100% [[specificity (tests)|specificity]] (no false positives). The (0,1) point is also called a ''perfect classification''. A random guess would give a point along a diagonal line (the so-called ''line of no-discrimination'') from the bottom left to the top right corner (regardless of the positive and negative [[base rate]]s).<ref>{{Cite web|title=classification - AUC-ROC of a random classifier|url=https://datascience.stackexchange.com/a/31877/108151|access-date=2020-11-30|website=Data Science Stack Exchange}}</ref> An intuitive example of random guessing is deciding by flipping a coin. As the size of the sample increases, a random classifier's ROC point tends towards the diagonal line; in the case of a balanced coin, it tends to the point (0.5, 0.5).

The diagonal divides the ROC space: points above it represent good classification results (better than random), and points below it represent bad results (worse than random). Note that the output of a consistently bad predictor can simply be inverted to obtain a good predictor.
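As a minimal illustration of these definitions, the following Python sketch (written for this explanation; the function name and call signature are not from any particular library) maps contingency-table counts to a point in ROC space:

<syntaxhighlight lang="python">
def roc_point(tp, fn, fp, tn):
    """Return the (FPR, TPR) coordinates of a classifier in ROC space."""
    tpr = tp / (tp + fn)  # sensitivity: correct positives among all positives
    fpr = fp / (fp + tn)  # 1 - specificity: false alarms among all negatives
    return fpr, tpr

# A perfect classifier sits at (0, 1); a fair coin tends towards (0.5, 0.5).
print(roc_point(tp=63, fn=37, fp=28, tn=72))  # method A below: (0.28, 0.63)
</syntaxhighlight>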
Consider four prediction results from 100 positive and 100 negative instances:

{| class="wikitable"
! A !! B !! C !! C′
|-
|
{| style="text-align:center;"
| style="border:thin solid; padding:1em;" | TP = 63 || style="border:thin solid; padding:1em;" | FN = 37 || 100
|-
| style="border:thin solid; padding:1em;" | FP = 28 || style="border:thin solid; padding:1em;" | TN = 72 || 100
|-
| 91 || 109 || 200
|}
| style="padding-left:1em;" |
{| style="text-align:center;"
| style="border:thin solid; padding:1em;" | TP = 77 || style="border:thin solid; padding:1em;" | FN = 23 || 100
|-
| style="border:thin solid; padding:1em;" | FP = 77 || style="border:thin solid; padding:1em;" | TN = 23 || 100
|-
| 154 || 46 || 200
|}
| style="padding-left:1em;" |
{| style="text-align:center;"
| style="border:thin solid; padding:1em;" | TP = 24 || style="border:thin solid; padding:1em;" | FN = 76 || 100
|-
| style="border:thin solid; padding:1em;" | FP = 88 || style="border:thin solid; padding:1em;" | TN = 12 || 100
|-
| 112 || 88 || 200
|}
| style="padding-left:1em;" |
{| style="text-align:center;"
| style="border:thin solid; padding:1em;" | TP = 76 || style="border:thin solid; padding:1em;" | FN = 24 || 100
|-
| style="border:thin solid; padding:1em;" | FP = 12 || style="border:thin solid; padding:1em;" | TN = 88 || 100
|-
| 88 || 112 || 200
|}
|-
| style="padding-left:1em;" | TPR = 0.63 || style="padding-left:2em;" | TPR = 0.77 || style="padding-left:2em;" | TPR = 0.24 || style="padding-left:2em;" | TPR = 0.76
|-
| style="padding-left:1em;" | FPR = 0.28 || style="padding-left:2em;" | FPR = 0.77 || style="padding-left:2em;" | FPR = 0.88 || style="padding-left:2em;" | FPR = 0.12
|-
| style="padding-left:1em;" | PPV = 0.69 || style="padding-left:2em;" | PPV = 0.50 || style="padding-left:2em;" | PPV = 0.21 || style="padding-left:2em;" | PPV = 0.86
|-
| style="padding-left:1em;" | F1 = 0.66 || style="padding-left:2em;" | F1 = 0.61 || style="padding-left:2em;" | F1 = 0.23 || style="padding-left:2em;" | F1 = 0.81
|-
| style="padding-left:1em;" | ACC = 0.68 || style="padding-left:2em;" | ACC = 0.50 || style="padding-left:2em;" | ACC = 0.18 || style="padding-left:2em;" | ACC = 0.82
|}

Plots of the four results above in the ROC space are given in the figure. The result of method '''A''' clearly shows the best predictive power among '''A''', '''B''', and '''C'''. The result of '''B''' lies on the random-guess line (the diagonal), and it can be seen in the table that the [[Evaluation of binary classifiers#Single metrics|accuracy]] of '''B''' is 50%. However, when '''C''' is mirrored across the center point (0.5, 0.5), the resulting method '''C′''' is even better than '''A'''. This mirrored method simply reverses the predictions of whatever method or test produced the '''C''' contingency table: where '''C''' predicts '''p''' or '''n''', '''C′''' predicts '''n''' or '''p''', respectively. Although the original '''C''' method has negative predictive power, reversing its decisions yields a new method '''C′''' with positive predictive power, and in this manner the '''C′''' test performs best of all. The closer a result from a contingency table is to the upper left corner, the better it predicts; however, the distance from the random-guess line in either direction is the best indicator of how much predictive power a method has. If the result is below the line (i.e. the method is worse than a random guess), all of the method's predictions must be reversed in order to utilize its power, thereby moving the result above the random-guess line.
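A minimal sketch of this reversal (the helper below is written for this explanation and is not from any library): inverting every prediction swaps TP with FN among the actual positives, and FP with TN among the actual negatives, mirroring the ROC point of '''C''' through (0.5, 0.5):

<syntaxhighlight lang="python">
def invert(tp, fn, fp, tn):
    """Reverse every prediction: predicted p becomes n and vice versa.

    Among the actual positives, hits and misses swap (TP <-> FN);
    among the actual negatives, false alarms and correct rejections
    swap (FP <-> TN).
    """
    return fn, tp, tn, fp  # the new (tp, fn, fp, tn)

tp, fn, fp, tn = invert(tp=24, fn=76, fp=88, tn=12)    # method C -> C'
print(tp, fn, fp, tn)                  # 76 24 12 88, the C' column above
print(fp / (fp + tn), tp / (tp + fn))  # (0.12, 0.76) = (1 - 0.88, 1 - 0.24)
</syntaxhighlight>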