Living under the ROC

To diagnose a condition, physicians often resort to different tests. The performance of each of these diagnostic methods can be measured using different characteristics:
  • Predictive value is the probability of correctly identifying a subject's condition given the test result. ✅

  • Youden's J is the likelihood of a positive test result in subjects with the condition versus those without the condition probability of an informed decision. ๐Ÿ‘€

  • Sensitivity is the proportion of people who actually have a target disease that are tested positive (true positive or detection rate).๐Ÿ“

    A negative result in a test with high sensitivity can be useful for ruling out disease, because it has a lower type II error rate.

  • Specificity is the proportion of people who do not have a target disease that are tested negative (true negative rate). The false positive rate (1 − specificity) is the probability of false alarm. ๐Ÿšซ

    A positive result in a test with high specificity can be useful for ruling in disease, because it has a lower type I error rate. 
When used to compare diagnostic accuracy, this enables the choice of the best test or combination of tests for a specific purpose.
 
Example: Rapid antigen tests for SARS-CoV-2 detection have 100% specificity and lower sensitivity compared to RT-PCR detection (see here). ๐Ÿงช

One of the challenges in interpreting the results of diagnostic tests is that they may yield continuous measures. Therefore, it is necessary to determine thresholds.๐Ÿšฆ

Enter receiver operator characteristic curves (ROC): plots of true positives against false positives for all cut-off values. 


๐Ÿ‘†We expect the ROC curve of a diagnostic test with reasonable accuracy to be in the upper left quadrant above the reference line.  

๐Ÿ˜ก When a loose cut-off point value is applied, the point moves upward and to the right along the curve.

๐Ÿ˜Š When a strict cut-off point value is applied, the point on the curve moves downward and to the left along the curve. 

 

But this plot provides more information than just determining cut-off points.

The most commonly used parameter to compare the efficacy of two (or more) diagnostic tests or markers is the area under the curve (AUC). ๐Ÿ’ฏ

For any test to be statistically significant, the lower 95% confidence interval value of the AUC must be >0.5 (above the red reference line in the graph). 

However, for a diagnostic test to be clinically meaningful (meaning, for clinicians to truly base their decisions on its results), only an AUC ≥0.8 is considered acceptable.

Clinical interpretation of the value of a diagnostic test based on AUC ROC (see here).

This is also used in clinical research to measure the inter-observer variability, when two or more observers measure the same continuous variable. ๐Ÿ‘ฅ

 

And that's it regarding study design for now. 

Next, I'll look at examples from my own work.

Comments

Leave a comment!

How to spot fake reviewers: a beginner's guide

Auditing published papers (part I)

IMHO: why open science should adopt double anonymous peer review