Abstract
Background: In the majority of diagnostic reviews there is more variability in accuracy measures than can be expected due to chance alone. As a variety of approaches exist for how reviewers examine, measure, report and interpret their results in such circumstances, more guidance is urgently needed.
Objectives: To describe the methods currently used in diagnostic reviews to visualize, quantify, and report statistical heterogeneity in accuracy results between primary studies and to explore how the results of this examination influence subsequent analysis decisions and formulation of conclusions.
Methods: Systematic reviews on diagnostic tests published in MEDLINE-indexed journals between May and September 2012 were identified using a systematic search. Using a standardized form, information was extracted on the clinical context and methods applied from the main meta-analysis in each review.
Results: 53 meta-analyses met inclusion criteria. These meta-analyses contained a median of 14 primary studies (IQR = 9.5–20.5). Statistical tests for heterogeneity were used in only 72% of the meta-analyses. The most common tests were I² (29), followed by χ² (26), and τ² (5). Heterogeneity was represented visually in all but 5 meta-analyses; 40 plotted sensitivity and specificity in ROC space and 34 presented forest plots. Data on how the investigation of statistical heterogeneity influenced subsequent analysis decisions (i.e. whether to investigate sources of heterogeneity) and the formulation of conclusions will be available before the colloquium.
Conclusions: The exploration of statistical heterogeneity in diagnostic accuracy meta-analyses is increasing, although not yet universal. However, there is a lack of consistency in which heterogeneity tests are used, how these tests are interpreted, and how these results influence subsequent analysis decisions and conclusions. In a diagnostic meta-analysis, because mean values are difficult to interpret and translate to clinical practice and because confidence intervals and ellipses do not accurately reflect the amount of between-study variation, identifying sources of variability becomes important.
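Illustration: the three heterogeneity statistics named in the Results (χ², reported here as Cochran's Q, together with I² and τ²) can be computed from per-study estimates and their variances. The sketch below is an assumption-laden, univariate example, not the method used by any of the included reviews: it works on logit-transformed sensitivities, uses the DerSimonian-Laird moment estimator for τ², and applies a 0.5 continuity correction. The function names and the 2x2 counts are hypothetical and chosen only for demonstration.

```python
import math


def logit_sensitivity(tp, fn):
    """Return (logit sensitivity, approximate variance) for one study.

    A 0.5 continuity correction is added to both cells (an assumption made
    here so that studies with zero false negatives remain usable).
    """
    tp_c, fn_c = tp + 0.5, fn + 0.5
    return math.log(tp_c / fn_c), 1.0 / tp_c + 1.0 / fn_c


def heterogeneity(effects, variances):
    """Cochran's Q, I^2 (in percent) and DerSimonian-Laird tau^2."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    weights = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Q: weighted squared deviations from the fixed-effect pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I^2: share of total variation attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # tau^2: between-study variance, truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    return {"Q": q, "df": df, "I2": i2, "tau2": tau2}


if __name__ == "__main__":
    # Hypothetical (true positives, false negatives) counts per primary study.
    studies = [(45, 5), (30, 20), (80, 10), (18, 12), (60, 3)]
    pairs = [logit_sensitivity(tp, fn) for tp, fn in studies]
    result = heterogeneity([y for y, _ in pairs], [v for _, v in pairs])
    print(result)
```

Q is referred to a χ² distribution with `df` degrees of freedom to obtain a p-value. Because sensitivity and specificity are correlated across studies, a univariate summary like this one describes only part of the variation that ROC-space plots display.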