Systematic reviews of diagnostic test accuracy: a pilot review

Article type
Authors
Leeflang MM, Hooft L, Reitsma JB, Scholten R
Abstract
Background: In July 2005 the pilot phase of the implementation plan for Cochrane Reviews of Diagnostic Test Accuracy started. Its main goal was to field-test the methods and material provided by the Diagnostic Test Accuracy working group. This report is about one of those pilot reviews on diagnostic test accuracy. We will focus on the methodological challenges met during the review process.
Objectives: To obtain summary estimates of the diagnostic accuracy of a commercial laboratory ELISA test for the diagnosis of invasive aspergillosis (IA), the most common life-threatening opportunistic invasive mycosis in immunocompromised patients. A second objective was to test the methods and material provided by the Diagnostic Test Accuracy working group.
Methods: We systematically reviewed studies assessing the diagnostic accuracy of a commercial ELISA, according to the (draft) Cochrane Diagnostic Reviewers' Handbook. We extracted 2-by-2 tables from the studies and subsequently meta-analyzed pairs of sensitivity and specificity using a bivariate random effects approach, via PROC MIXED in SAS, version 9.1.3 (Cary, NC).
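To illustrate the data extraction step, the following minimal Python sketch shows how a single 2-by-2 table yields the pair of sensitivity and specificity, together with the logit-transformed values and approximate variances that a bivariate model typically takes as input. The function name, the continuity correction and the example counts are assumptions made for illustration; they are not taken from the review, and the sketch does not reproduce the SAS analysis.

```python
import math


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity, specificity and their logits for one 2-by-2 table.

    Hypothetical helper: tp/fp/fn/tn are the counts of true positives,
    false positives, false negatives and true negatives.
    """
    # Assumption: add 0.5 to every cell only when a zero cell would make
    # a logit undefined (a common, but not the only, correction).
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # logit(p) = log(p / (1 - p)), written directly in terms of counts
        "logit_sensitivity": math.log(tp / fn),
        "logit_specificity": math.log(tn / fp),
        # Large-sample variance of a logit-transformed proportion
        "var_logit_sensitivity": 1 / tp + 1 / fn,
        "var_logit_specificity": 1 / tn + 1 / fp,
    }


# Made-up counts for a single hypothetical study
result = accuracy_from_2x2(tp=30, fp=10, fn=10, tn=90)
print(result["sensitivity"], result["specificity"])
```

In a meta-analysis, one such pair of logits per study would be passed to the random effects model, which estimates the mean logit sensitivity, the mean logit specificity and their between-study covariance.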
Results: During the review process, we met a few methodological challenges. First, the commonly accepted reference standard consisted of a set of criteria that categorizes patients into four groups: proven IA, probable IA, possible IA and no IA. This resulted in a 2-by-4 table for each included study. Converting all 2-by-4 tables into 2-by-2 tables resulted in an overall sensitivity of 75% (65% - 82%) and an overall specificity of 90% (84% - 93%). An alternative way to analyze the 2-by-4 tables was to extend the bivariate analysis to a multivariate method. Second, reporting of study design features and quality issues was poor: fewer than 50% of the included studies reported sufficient data on inclusion criteria, interval between index test and diagnosis, blinding of reference test results and index test results, data on indeterminate results and information about sponsoring. Third, the test was used in different ways: seven of the 20 studies used the ELISA as a monitoring tool (once or twice a week), seven used it as a diagnostic tool or as a tool to guide therapy, and six did not report how they used the test in practice. Subgroup analysis revealed no significant differences in diagnostic accuracy. Fourth, definitions of test positivity also varied between studies. Three different cut-off values were reported most often, but others were also used. If more than one cut-off value was reported, we chose the lowest one to include in the analysis. Four studies reported cut-off values for indeterminate results, but they did not report how the indeterminate results were handled in the analyses. Some studies regarded one positive sample as a positive result, while others required two consecutive positive samples. Subgroup analysis for the different cut-off values did not show any differences in diagnostic accuracy, although a clear threshold effect was present in the ROC plot.
The threshold effect was mainly caused by differences in the reference test: sensitivity in studies that used the more recently published reference criteria was 68% (58% - 77%), while studies that did not refer to this publication reported a sensitivity of 86% (76% - 92%). The difference in specificity was smaller: 89% (81% - 93%) versus 92% (82% - 96%).
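The conversion of a 2-by-4 table into a 2-by-2 table mentioned above can be sketched as follows. The abstract does not state which reference categories were counted as diseased, so the sketch makes the grouping an explicit parameter; the default grouping (proven and probable IA as diseased), the function name and all counts are assumptions for illustration only.

```python
CATEGORIES = ("proven", "probable", "possible", "none")


def collapse_2x4(test_pos, test_neg, diseased=("proven", "probable")):
    """Collapse a 2-by-4 table into (tp, fp, fn, tn).

    test_pos / test_neg map each reference category to the number of
    patients with a positive / negative index test. `diseased` lists the
    categories treated as having IA; this choice is an assumption here.
    """
    non_diseased = [c for c in CATEGORIES if c not in diseased]
    tp = sum(test_pos[c] for c in diseased)
    fn = sum(test_neg[c] for c in diseased)
    fp = sum(test_pos[c] for c in non_diseased)
    tn = sum(test_neg[c] for c in non_diseased)
    return tp, fp, fn, tn


# Made-up counts for one hypothetical study
pos = {"proven": 8, "probable": 10, "possible": 6, "none": 4}
neg = {"proven": 2, "probable": 5, "possible": 14, "none": 71}

# Grouping A: proven + probable IA count as diseased
table_a = collapse_2x4(pos, neg)
# Grouping B: possible IA is also counted as diseased
table_b = collapse_2x4(pos, neg, diseased=("proven", "probable", "possible"))

for label, (tp, fp, fn, tn) in (("A", table_a), ("B", table_b)):
    print(label, "sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
```

With these made-up counts, moving the possible IA group from non-diseased to diseased changes both sensitivity and specificity, which shows why the choice of grouping has to be reported alongside the summary estimates.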
Conclusions: Although methods for meta-analyzing diagnostic data are getting clearer, some data remain difficult to analyze and interpret. We will provide some possible solutions for the challenges we encountered.