Reproducibility of meta-analytic results in systematic reviews of interventions: a meta-research study

Authors
Alqaidoom Z1, Hamilton D1, McKenzie J1, Moher D2,3, Nguyen P1, Page M1
1Methods in Evidence Synthesis Unit, School of Public Health and Preventive Medicine, Monash University, St Kilda, Victoria, Australia
2Centre for Journalology, Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
3School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
Abstract
Background: The results of meta-analyses should be reproducible (ie, others should obtain sufficiently similar results when reanalyzing the same data using the same methods). We aimed to determine how often meta-analyses of the effects of interventions are reproducible.

Methods: We included 121 systematic reviews of health, social, behavioral, or educational interventions indexed in 5 databases in November 2020. We contacted the original authors to obtain data and the analytic code used to generate the first reported (“index”) meta-analysis. If not provided, we extracted the data from the review. Two investigators independently attempted to reproduce the meta-analysis results using the same data, analytic code, or methods description provided in the review. We calculated the limits of agreement between the reproduced and original summary estimates of intervention effect and estimates of I², displayed using Bland-Altman plots.
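The agreement analysis described above can be sketched in a few lines. The snippet below is a minimal illustration, not the study's actual analysis: the paired estimates are hypothetical, and for ratio measures (odds ratios, risk ratios) the same calculation would be applied on the log scale so that the limits back-transform to "x% smaller / x% larger" statements.

```python
import math

# Hypothetical reproduced vs. original summary estimates (illustrative only;
# not data from the study). Here they stand in for standardized mean
# differences extracted from a set of reviews.
original   = [0.32, -0.15, 0.08, 0.51, -0.02, 0.27]
reproduced = [0.32, -0.14, 0.08, 0.50, -0.02, 0.28]

def limits_of_agreement(a, b):
    """Bland-Altman 95% limits of agreement: mean difference +/- 1.96 SD."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    return mean - 1.96 * sd, mean + 1.96 * sd

lower, upper = limits_of_agreement(reproduced, original)
print(f"95% limits of agreement: {lower:.4f} to {upper:.4f}")
```

A Bland-Altman plot would then display each pair's difference against its mean, with horizontal lines at the mean difference and at these two limits.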

Results: Twenty-two authors (19%) provided data, analytic code, or both. Of the included meta-analyses, 10 (8%) had insufficient information to attempt reproduction. Across all meta-analyses, irrespective of effect measure, there was on average no difference between the reproduced and original summary estimates. For 95% of meta-analyses, the difference between the reproduced and original summary estimates was between -0.007 and 0.006 for standardized mean differences (n = 30) (Figure A); between 5% smaller and 5% larger for odds ratios (n = 31) (Figure B); and between 8% smaller and 12% larger for risk ratios (n = 22) (Figure C). The difference in estimates of I² between reproduced and original meta-analyses (n = 102) was between -15.5 and 13.3 percentage points for 95% of meta-analyses (Figure D). Comparisons of other statistics will be presented at the Summit.

Conclusion: Few systematic reviewers responded to our request to share their data or analytic code (or both). Reproduced meta-analytic results were similar to the original results, and the discrepancies observed were unlikely to change the conclusions of the reviews. To facilitate future reproducibility attempts, we recommend that systematic reviewers report their statistical methods in greater detail and make their data and analytic code publicly available.

Relevance: Our findings will inform strategies to improve the reproducibility of meta-analyses and thereby enhance public trust in meta-analytic findings.