Abstract
Binary classifiers are routinely evaluated with performance measures such as sensitivity and specificity, and performance is frequently illustrated with Receiver Operating Characteristics (ROC) plots. Alternative measures such as positive predictive value (PPV) and the associated Precision/Recall (PRC) plots are used less frequently. Many bioinformatics studies develop and evaluate classifiers that are to be applied to strongly imbalanced datasets, in which negatives greatly outnumber positives. While ROC plots are visually appealing and provide an overview of a classifier's performance across a wide range of specificities, one can ask whether they could be misleading when applied in imbalanced classification scenarios. We show here that the visual interpretability of ROC plots in the context of imbalanced datasets can be deceptive with respect to conclusions about the reliability of classification performance, owing to an intuitive but wrong interpretation of specificity. PRC plots, on the other hand, can provide the viewer with an accurate prediction of future classification performance because they evaluate the fraction of true positives among positive predictions. Our findings have potential implications for the interpretation of a large number of studies that use ROC plots on imbalanced datasets.
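The contrast between specificity and precision described above can be illustrated with a small arithmetic sketch. The class sizes, the sensitivity of 0.8, and the specificity of 0.9 are made-up illustrative values, not results from the paper, and the helper names `confusion_counts` and `rates` are hypothetical. The sketch holds sensitivity and specificity fixed and changes only the number of negatives.

```python
def confusion_counts(n_pos, n_neg, sensitivity, specificity):
    """Build confusion-matrix counts for a classifier with the given rates.

    All inputs are illustrative assumptions, not values from the paper.
    """
    tp = round(n_pos * sensitivity)   # positives correctly predicted positive
    fn = n_pos - tp                   # positives missed
    tn = round(n_neg * specificity)   # negatives correctly predicted negative
    fp = n_neg - tn                   # negatives wrongly predicted positive
    return tp, fn, fp, tn


def rates(tp, fn, fp, tn):
    """Return (sensitivity, specificity, precision) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)      # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)        # also called positive predictive value (PPV)
    return sensitivity, specificity, precision


# Same classifier behaviour, two hypothetical class distributions.
scenarios = {
    "balanced (1000 pos, 1000 neg)": confusion_counts(1000, 1000, 0.8, 0.9),
    "imbalanced (1000 pos, 100000 neg)": confusion_counts(1000, 100000, 0.8, 0.9),
}

for name, counts in scenarios.items():
    sens, spec, prec = rates(*counts)
    print(f"{name}: sensitivity={sens:.3f} specificity={spec:.3f} precision={prec:.3f}")
```

Under these assumptions, sensitivity and specificity are identical in both scenarios, so the corresponding ROC point does not move, while precision falls from about 0.889 to about 0.074 because the false positives now far outnumber the true positives.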
Related Publications
ROC Curves for Classification Trees
A common problem in medical diagnosis is to combine information from several tests or patient characteristics into a decision rule to distinguish diseased from healthy patients....
The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation
Background: To evaluate binary classifications and their confusion matrices, scientific researchers can employ several statistical rates, accordingly to the goal of the ...
Interpretable Classification Models for Recidivism Prediction
We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent and interpretable to use for d...
Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine
The clinical performance of a laboratory test can be described in terms of diagnostic accuracy, or the ability to correctly classify subjects into clinically relevant s...
A review of methods for the assessment of prediction errors in conservation presence/absence models
Predicting the distribution of endangered species from habitat data is frequently perceived to be a useful technique. Models that predict the presence or absence of a species ar...
Publication Info
- Year: 2015
- Type: article
- Volume: 10
- Issue: 3
- Pages: e0118432-e0118432
- Citations: 3921
- Access: Closed
Identifiers
- DOI: 10.1371/journal.pone.0118432