An empirical comparison of methods for meta-analysis of studies of diagnostic accuracy

Authors
Harbord R, Bachmann L, Shang A, Whiting P, Deeks J, Egger M, Sterne J
Abstract
Background: Reviews of diagnostic test accuracy studies are being incorporated into the Cochrane Collaboration. There is limited practical experience with statistical methods for such reviews, and uncertainty as to the most appropriate choice of method.

Objective: To compare the results of four methods for meta-analysis of diagnostic accuracy studies when applied to 9 diagnostic meta-analyses.

Methods: We applied the following models:

1. Separate random-effects meta-analysis of sensitivity and specificity, ignoring any correlation between them, fitted either by (1a) random-effects meta-analysis of logit-transformed sensitivity and specificity or by (1b) random-effects logistic regression;

2. Summary ROC (SROC) curve fitted using simple linear regression [Littenberg & Moses, Med Decis Making 1993];

3. Bivariate random-effects meta-analysis of (logit-transformed) sensitivity and specificity [van Houwelingen et al., Stat Med 2002];

4. Hierarchical SROC (HSROC) model [Rutter & Gatsonis, Stat Med 2001].
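
As a rough illustration of method 1a, the sketch below pools logit-transformed proportions (sensitivity or specificity) across studies. It rests on assumptions not stated in the abstract: the DerSimonian-Laird moment estimator for the between-study variance, and a 0.5 continuity correction applied to every study. The function names `pool_logit_dl` and `inv_logit` are hypothetical.

```python
import math

def pool_logit_dl(events, totals):
    """Random-effects pooling of logit-transformed proportions.

    Assumption: DerSimonian-Laird estimator with a 0.5 continuity
    correction in every study (the abstract does not name an estimator).
    Returns (pooled logit, standard error, between-study variance).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5      # corrected successes and failures
        y.append(math.log(a / b))        # study-level logit
        v.append(1.0 / a + 1.0 / b)      # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # Moment estimate of the between-study variance, truncated at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return mu, se, tau2

def inv_logit(x):
    """Back-transform a logit to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))
```

Calling the function once with true positives and diseased totals, and once with true negatives and non-diseased totals, gives a summary operating point after back-transformation with `inv_logit`. The two fits are separate, so any correlation between sensitivity and specificity is ignored, as in method 1.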

We fitted model 2 in Stata, and models 3 and 4 using the SAS NLMIXED procedure. We did not include covariates. We compared summary estimates of sensitivity and specificity (models 1, 3 and 4) and SROCs (models 2 and 4).
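
For orientation, the Littenberg-Moses approach of model 2 regresses D = logit(TPR) - logit(FPR) on S = logit(TPR) + logit(FPR) and back-transforms the fitted line into a curve in ROC space. The Python sketch below is illustrative only and is not the Stata code used in the study. It assumes an unweighted least-squares fit and a 0.5 correction added to every cell; the function names are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def fit_moses_sroc(tp, fn, fp, tn):
    """Fit D = intercept + slope * S by unweighted least squares.

    Assumption: 0.5 is added to every cell of each 2x2 table.
    Requires at least two studies with differing S.
    """
    d, s = [], []
    for a, b, c, e in zip(tp, fn, fp, tn):
        tpr = (a + 0.5) / (a + b + 1.0)   # corrected sensitivity
        fpr = (c + 0.5) / (c + e + 1.0)   # corrected 1 - specificity
        d.append(logit(tpr) - logit(fpr)) # log diagnostic odds ratio
        s.append(logit(tpr) + logit(fpr)) # proxy for test threshold
    n = len(d)
    s_bar, d_bar = sum(s) / n, sum(d) / n
    sxx = sum((si - s_bar) ** 2 for si in s)
    sxy = sum((si - s_bar) * (di - d_bar) for si, di in zip(s, d))
    slope = sxy / sxx
    intercept = d_bar - slope * s_bar
    return intercept, slope

def sroc_sensitivity(fpr, intercept, slope):
    """Sensitivity on the fitted SROC at a given false positive rate.

    Solving D = intercept + slope * S for logit(TPR) gives
    logit(TPR) = (intercept + (1 + slope) * logit(FPR)) / (1 - slope).
    """
    lt = (intercept + (1.0 + slope) * logit(fpr)) / (1.0 - slope)
    return 1.0 / (1.0 + math.exp(-lt))
```

A slope of zero corresponds to a constant diagnostic odds ratio and a symmetric curve; a non-zero slope makes the curve asymmetric. The curve is best evaluated only at false positive rates within the range observed in the included studies.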

Results: In all examples, there was evidence of heterogeneity in one or more parameters. Methods 3 and 4 gave the same summary estimates in all cases. The results of method 1a were very similar to these. Method 1b gave results that differed noticeably from those of the other methods in two of the nine cases. In six cases, the SROC estimated using method 2 was similar to that estimated using method 4. In two cases, method 2 gave a more asymmetric SROC than method 4, but the difference was small over the range of the data. In one case, in which there was a positive correlation between study estimates of sensitivity and specificity, the SROC estimated using method 2 was strikingly different from that estimated using method 4.

Conclusions: Separate meta-analysis of logit-transformed sensitivity and specificity provides a summary operating point very similar to that from more sophisticated methods. The Littenberg-Moses method often gives a reasonable approximation to the summary ROC curve produced by the HSROC model, but the curve should not be extrapolated beyond the range of the data.