An empirical assessment of bivariate methods for meta-analysis of test performance

Authors
Dahabreh IJ1, Trikalinos TA2, Lau J1, Schmid CH3
1Center for Clinical Evidence Synthesis, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA
2Center for Evidence-based Medicine, Program in Public Health, Brown University, Providence, RI, USA
3Biostatistics Research Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA
Abstract
Background: Meta-analysis of diagnostic test accuracy studies must account for the correlation between sensitivity and specificity. Summary receiver operating characteristic (SROC) curves and bivariate models have become the methods of choice. However, different forms of these techniques exist, and methods based on univariate models remain in common use.
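For reference, a minimal sketch of the bivariate random-effects model alluded to here, in the common logit-normal parameterization (the notation below is ours and is not taken from the study), treats each study's true logit sensitivity and specificity as a draw from a bivariate normal distribution:

\[
\begin{pmatrix} \mathrm{logit}(Se_i) \\ \mathrm{logit}(Sp_i) \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
\begin{pmatrix} \tau_{Se}^{2} & \rho\,\tau_{Se}\tau_{Sp} \\ \rho\,\tau_{Se}\tau_{Sp} & \tau_{Sp}^{2} \end{pmatrix}
\right),
\]

where the between-study correlation \(\rho\), typically negative, is exactly what separate univariate analyses of sensitivity and specificity ignore.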

Objectives: To empirically compare different approaches to analyzing sensitivity and specificity, using a large dataset of published diagnostic test meta-analyses.

Methods: We searched PubMed (1987–2003) for meta-analyses that reported 2 × 2 tables for at least one diagnostic outcome. The methods evaluated included fixed-effect univariate meta-analyses, and univariate and bivariate random-effects meta-analyses estimated by maximum likelihood (ML) using both the normal approximation and the exact binomial likelihood. The bivariate model with the exact binomial likelihood was also fit using a fully Bayesian approach. We constructed SROC curves using the Moses-Littenberg fixed-effects method (weighted and unweighted), the Rutter-Gatsonis hierarchical SROC (HSROC) method, and four alternative HSROC methods.
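To make the distinctions concrete, the following is a sketch of the standard formulations of these methods in our own notation (assumed, not taken from the paper). Writing $y_{1i}$ for the true positives among $n_{1i}$ diseased and $y_{2i}$ for the true negatives among $n_{2i}$ non-diseased participants in study $i$, and $\theta_{1i} = \mathrm{logit}(Se_i)$, $\theta_{2i} = \mathrm{logit}(Sp_i)$ for the true study-specific logits, the exact binomial likelihood models

\[
y_{1i} \sim \mathrm{Bin}\!\big(n_{1i},\, \mathrm{logit}^{-1}(\theta_{1i})\big),
\qquad
y_{2i} \sim \mathrm{Bin}\!\big(n_{2i},\, \mathrm{logit}^{-1}(\theta_{2i})\big),
\]

whereas the normal approximation replaces these within-study distributions with

\[
\mathrm{logit}(\widehat{Se}_i) \;\sim\; N\!\big(\theta_{1i},\, 1/y_{1i} + 1/(n_{1i}-y_{1i})\big),
\qquad
\mathrm{logit}(\widehat{Sp}_i) \;\sim\; N\!\big(\theta_{2i},\, 1/y_{2i} + 1/(n_{2i}-y_{2i})\big),
\]

usually after a continuity correction when a cell count is zero. The Moses-Littenberg method fits the linear regression $D_i = a + b S_i$, with $D_i = \mathrm{logit}(Se_i) - \mathrm{logit}(1-Sp_i)$ and $S_i = \mathrm{logit}(Se_i) + \mathrm{logit}(1-Sp_i)$, which implies the SROC curve

\[
\mathrm{logit}(Se) = \frac{a}{1-b} + \frac{1+b}{1-b}\,\mathrm{logit}(1-Sp).
\]

The Rutter-Gatsonis HSROC model instead specifies, for the probability $\pi_{ij}$ of a positive test in disease group $j$ of study $i$ (with $X_{ij} = -1/2$ for non-diseased and $+1/2$ for diseased subjects),

\[
\mathrm{logit}(\pi_{ij}) = \big(\theta_i + \alpha_i X_{ij}\big)\exp(-\beta X_{ij}),
\]

where the cutpoint parameters $\theta_i$ and accuracy parameters $\alpha_i$ vary randomly across studies and $\beta$ is a shape parameter that allows accuracy to change with the cutpoint.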

Results: We identified 308 meta-analyses. All normal-approximation methods estimated summary sensitivity and specificity closer to 0.5 and gave smaller standard errors than exact binomial likelihood methods. Marginal results of univariate and bivariate random-effects meta-analyses were similar regardless of the estimation method. Bivariate models fit by ML and fully Bayesian methods gave similar point estimates, but the Bayesian models indicated additional uncertainty. All bivariate methods estimated the correlation between sensitivity and specificity poorly. Moses-Littenberg and Rutter-Gatsonis SROC curves produced similar results. Alternative parameterizations of the HSROC regression resulted in markedly different summary lines in one-third of the meta-analyses; whether they did depended on the estimated covariance between sensitivity and specificity.
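The abstract does not name the four alternative parameterizations. As an illustration of why summary lines can hinge on the estimated covariance, note that a summary line can be derived from the between-study covariance matrix of $(\mathrm{logit}(1-Sp), \mathrm{logit}(Se))$ in several non-equivalent ways, for example

\[
b_{1} = \frac{\sigma_{12}}{\sigma_{2}^{2}}
\qquad\text{versus}\qquad
b_{2} = \frac{\sigma_{1}^{2}}{\sigma_{12}},
\]

where $\sigma_1^2$, $\sigma_2^2$, and $\sigma_{12}$ denote the between-study variances of logit sensitivity and the logit false-positive rate and their covariance; $b_1$ is the slope from regressing logit sensitivity on the logit false-positive rate, and $b_2$ is the slope of the reverse regression re-expressed with logit sensitivity as the outcome. When $\sigma_{12}$ is near zero or imprecisely estimated, these slopes, and the summary lines they define, can differ substantially.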

Conclusions: Meta-analytic summaries of sensitivity and specificity can be sensitive to model choice when information is sparse, particularly when the correlation between sensitivity and specificity is large and poorly estimated. In such cases, fully Bayesian bivariate methods with exact binomial likelihoods can appropriately quantify estimation uncertainty.