Living under the ROC

To diagnose a condition, physicians often resort to different tests. The performance of each of these diagnostic methods can be measured using different characteristics:
  • Predictive value is the probability of correctly identifying a subject's condition given the test result. ✅

  • Youden's J is the likelihood of a positive test result in subjects with the condition versus those without the condition probability of an informed decision. 👀

  • Sensitivity is the proportion of people who actually have a target disease that are tested positive (true positive or detection rate).📍

    A negative result in a test with high sensitivity can be useful for ruling out disease, because it has a lower type II error rate.

  • Specificity is the proportion of people who do not have a target disease that are tested negative (true negative rate). The false positive rate (1 − specificity) is the probability of false alarm. 🚫

    A positive result in a test with high specificity can be useful for ruling in disease, because it has a lower type I error rate. 
When used to compare diagnostic accuracy, this enables the choice of the best test or combination of tests for a specific purpose.
 
Example: Rapid antigen tests for SARS-CoV-2 detection have 100% specificity and lower sensitivity compared to RT-PCR detection (see here). 🧪

One of the challenges in interpreting the results of diagnostic tests is that they may yield continuous measures. Therefore, it is necessary to determine thresholds.🚦

Enter receiver operator characteristic curves (ROC): plots of true positives against false positives for all cut-off values. 


👆We expect the ROC curve of a diagnostic test with reasonable accuracy to be in the upper left quadrant above the reference line.  

😡 When a loose cut-off point value is applied, the point moves upward and to the right along the curve.

😊 When a strict cut-off point value is applied, the point on the curve moves downward and to the left along the curve. 

 

But this plot provides more information than just determining cut-off points.

The most commonly used parameter to compare the efficacy of two (or more) diagnostic tests or markers is the area under the curve (AUC). 💯

For any test to be statistically significant, the lower 95% confidence interval value of the AUC must be >0.5 (above the red reference line in the graph). 

However, for a diagnostic test to be clinically meaningful (meaning, for clinicians to truly base their decisions on its results), only an AUC ≥0.8 is considered acceptable.

Clinical interpretation of the value of a diagnostic test based on AUC ROC (see here).

This is also used in clinical research to measure the inter-observer variability, when two or more observers measure the same continuous variable. 👥

 

And that's it regarding study design for now. 

Next, I'll look at examples from my own work.

Comments

Leave a comment!

Pride and Respect

London Book Fair 2024: OA and AI

London Book Fair 2024: challenges for publishers