Abstract
Background: Statistical methods for evaluating meta-analysis (MA) of diagnostic tests have lagged behind those for MA of treatment comparisons. Many papers continue to report single estimates of sensitivity, specificity and odds ratios, when more sophisticated methods to address study heterogeneity are available.
Objectives: To compare the performance of different models applied to different diagnostic test outcomes in the collection of all diagnostic test MAs identified in the peer-reviewed literature.
Methods: We collected all MAs of diagnostic tests published through 2003, identified from a MEDLINE search supplemented by references from review articles. We recorded the numbers of true positives, true negatives, false positives and false negatives, together with study-level covariates. Analyses included fixed-effect and random-effects models of sensitivity, specificity, positive and negative likelihood ratios and the diagnostic odds ratio (DOR), as well as weighted and unweighted SROC analysis. We also performed random-effects and SROC meta-regression.
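For concreteness, the sketch below is a minimal, NumPy-only Python implementation of the standard versions of the analyses the Methods name: per-study log DOR, fixed-effect and DerSimonian-Laird random-effects pooling with Cochran's Q and I², and the classical Moses-Littenberg SROC regression with its Q* summary. The abstract does not say which specific estimators were used, so these are the era's conventional choices, not the paper's code; the function names and the 0.5 continuity correction are illustrative assumptions. The weighted SROC variant would fit the same regression with inverse-variance weights.

```python
import numpy as np

def log_dor(tp, fp, fn, tn, cc=0.5):
    """Log diagnostic odds ratio and its variance from one 2x2 table.
    The 0.5 continuity correction for zero cells is an illustrative choice."""
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + cc for x in (tp, fp, fn, tn))
    y = np.log((tp * tn) / (fp * fn))
    v = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return y, v

def pool(y, v):
    """Fixed-effect and DerSimonian-Laird random-effects pooling of study
    estimates y with variances v; also returns Cochran's Q and I^2 (%)."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1 / v
    mu_fe = np.sum(w * y) / np.sum(w)                 # fixed-effect mean
    q = np.sum(w * (y - mu_fe) ** 2)                  # Cochran's Q
    df = len(y) - 1
    tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_re = 1 / (v + tau2)                             # random-effects weights
    mu_re = np.sum(w_re * y) / np.sum(w_re)
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return mu_fe, mu_re, tau2, q, i2

def sroc(tp, fp, fn, tn, cc=0.5):
    """Unweighted Moses-Littenberg SROC regression D = a + b*S, where
    D = log DOR and S = logit(TPR) + logit(FPR) proxies the threshold.
    Returns intercept a, slope b (b ~ 0 means a symmetric curve) and Q*."""
    tp, fp, fn, tn = (np.asarray(x, float) + cc for x in (tp, fp, fn, tn))
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    d = np.log(tpr / (1 - tpr)) - np.log(fpr / (1 - fpr))  # log DOR
    s = np.log(tpr / (1 - tpr)) + np.log(fpr / (1 - fpr))
    b, a = np.polyfit(s, d, 1)
    q_star = 1 / (1 + np.exp(-a / 2))  # point on curve where sens = spec
    return a, b, q_star

# Hypothetical 2x2 counts (TP, FP, FN, TN) from three studies:
tables = [(45, 5, 5, 45), (30, 10, 8, 52), (60, 20, 15, 105)]
ys, vs = zip(*(log_dor(*t) for t in tables))
mu_fe, mu_re, tau2, q, i2 = pool(ys, vs)
print(f"pooled DOR (random effects) = {np.exp(mu_re):.1f}, I^2 = {i2:.0f}%")
a, b, q_star = sroc(*map(np.array, zip(*tables)))
print(f"SROC slope = {b:.2f}, Q* = {q_star:.2f}")
```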
Results: Two hundred and ninety-eight MAs incorporating 1410 study-level covariates were reported in 249 publications between 1987 and 2003. Heterogeneity was common for all outcomes, ranging from 66% of MAs for DOR to 85% for specificity. A large percentage of covariates correlated significantly with outcomes (> 90% for DOR). The chance of significant meta-regression also varied with study-level covariates. Unweighted SROC analyses were more conservative than weighted ones. One in six unweighted analyses showed correlation between DOR and the diagnostic threshold. After controlling for threshold, one in six covariates examined correlated with DOR. Areas under the SROC curve and Q* were generally high, with medians of 0.92 and 0.86, respectively.
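For orientation, Q* is the point on a symmetric SROC curve at which sensitivity equals specificity. Under the standard constant-DOR model (a textbook relation, not one stated in this abstract), it is determined by the curve's intercept a = ln DOR:

\[
\left(\frac{Q^*}{1 - Q^*}\right)^{2} = \mathrm{DOR}
\quad\Longrightarrow\quad
Q^* = \frac{e^{a/2}}{1 + e^{a/2}} .
\]

On that reading, the reported median Q* of 0.86 corresponds to a DOR of roughly (0.86/0.14)² ≈ 38.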
Conclusions: The high prevalence of heterogeneity argues for the universal use of random-effects models for diagnostic test MA. Meta-regression promises to uncover potential sources of heterogeneity. SROC curves are often asymmetric, and multiple curves may be required to describe variation in performance by characteristics of studies, tests and patients.