When can we trust ‘early’ statistically significant treatment effect estimates in cumulative cardiology meta-analyses? – a simulation study

Authors
Thorlund K1, Walsh M1, Imberger G2, Chu R1, Gluud C3, Wetterslev J4, Guyatt G1, Devereaux P1, Thabane L1
1Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada
2Cochrane Anaesthesia Review Group, Copenhagen, Denmark
3Copenhagen Trial Unit, Cochrane Hepato-Biliary Group, Copenhagen, Denmark
4Copenhagen Trial Unit, Centre of Clinical Intervention Research, Rigshospitalet, Copenhagen, Denmark
Abstract
Introduction: Meta-analyses that include only a small cumulative number of patients and events (‘early’ meta-analyses) are often underpowered to detect realistic treatment effects. Nevertheless, many examples of statistically significant ‘early’ meta-analyses exist in the medical literature. Statistically significant results in ‘early’ meta-analyses can only occur if 1) the treatment effect is overestimated, 2) the standard error is underestimated, or both. Meta-analysts typically assume that time-lag bias or publication bias explains ‘early’ statistical significance, but theoretical considerations suggest that random error (the play of chance) can also substantially affect the results of ‘early’ meta-analyses. We performed a simulation study to explore the extent to which random error causes overestimation of treatment effects in ‘early’ statistically significant cardiology meta-analyses.

Methods: To make the random-effects model meta-analysis simulation realistic for cardiology, we surveyed all meta-analyses on mortality from the Cochrane Heart Group and used the observed distributions of trial sizes, trial control group event rates, and trial treatment effects to set the simulation parameters. We simulated 10,000 meta-analyses with an overall true treatment effect of a relative risk (RR) of 0.80, with individual trial effects varying around 0.80 between possible extremes of 0.60 and 1.05 (i.e., moderate heterogeneity). We performed cumulative meta-analysis on each of the 10,000 simulated meta-analyses. Among the statistically significant (one-sided alpha = 0.025) cumulative point estimates, we calculated the proportions that were smaller than RR 0.70 and RR 0.60.

Results: Among the surveyed meta-analyses, trial sizes ranged from 40 to 400 patients (25% of trials), 401 to 1,000 patients (65%), and 1,001 to 10,000 patients (10%), and control group event rates ranged from 1% to 15%. Figures 1 and 2 present the proportion of statistically significant meta-analyses that yielded RR estimates smaller than 0.70 and 0.60 in relation to the cumulative number of events and patients.

Conclusion: In ‘early’ statistically significant meta-analyses, random error alone may cause overestimation of treatment effects. Depending on what is considered a ‘clinically important’ overestimate, approximately 8,000–20,000 patients or 400–2,000 events are needed to avoid spurious results.
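
To illustrate the mechanics of the simulation, the following is a minimal sketch in Python. It assumes 1:1 allocation, uniform sampling of trial sizes within the three surveyed strata, a log-normal distribution of true trial effects around RR 0.80 (with a spread chosen so trial effects span roughly 0.60 to 1.05), and DerSimonian-Laird random-effects pooling of log relative risks; none of these implementation details are specified in the abstract, and for brevity the sketch aggregates over all cumulative steps rather than stratifying by cumulative numbers of events and patients as the study does.

```python
import numpy as np

rng = np.random.default_rng(2024)  # fixed seed for reproducibility

def simulate_trials(n_trials=20):
    """Simulate one meta-analysis: per-trial event counts (assumed setup)."""
    # Trial sizes mimic the surveyed distribution: 25% of trials with 40-400
    # patients, 65% with 401-1,000, 10% with 1,001-10,000 (uniform within strata).
    stratum = rng.choice(3, size=n_trials, p=[0.25, 0.65, 0.10])
    low = np.array([40, 401, 1001])[stratum]
    high = np.array([400, 1000, 10000])[stratum]
    n_arm = rng.integers(low, high + 1) // 2          # 1:1 allocation (assumption)
    p_ctrl = rng.uniform(0.01, 0.15, n_trials)        # control event rates 1%-15%
    # True trial RRs vary moderately around 0.80; SD 0.13 on the log scale puts
    # ~95% of trial effects between roughly 0.60 and 1.05 (assumption).
    true_rr = np.exp(rng.normal(np.log(0.80), 0.13, n_trials))
    e_trt = rng.binomial(n_arm, np.clip(p_ctrl * true_rr, 0.0, 1.0))
    e_ctrl = rng.binomial(n_arm, p_ctrl)
    return e_trt, e_ctrl, n_arm

def cumulative_dl(e_trt, e_ctrl, n_arm):
    """Cumulative DerSimonian-Laird random-effects meta-analysis of log RRs."""
    a, c = e_trt + 0.5, e_ctrl + 0.5                  # 0.5 continuity correction
    m = n_arm + 1.0
    y = np.log((a / m) / (c / m))                     # per-trial log RR
    v = 1.0 / a - 1.0 / m + 1.0 / c - 1.0 / m         # approximate variance
    out = []
    for k in range(1, len(y) + 1):
        yk, vk = y[:k], v[:k]
        w = 1.0 / vk
        if k > 1:                                     # DL between-trial variance
            q = np.sum(w * (yk - np.sum(w * yk) / np.sum(w)) ** 2)
            tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
        else:
            tau2 = 0.0
        ws = 1.0 / (vk + tau2)
        est = np.sum(ws * yk) / np.sum(ws)
        se = np.sqrt(1.0 / np.sum(ws))
        # One-sided alpha = 0.025 test for benefit: upper 95% CI bound below RR 1
        out.append((np.exp(est), est + 1.96 * se < 0.0))
    return out

n_sig = below_070 = below_060 = 0
for _ in range(1_000):                                # 10,000 in the study
    for rr, significant in cumulative_dl(*simulate_trials()):
        if significant:
            n_sig += 1
            below_070 += rr < 0.70
            below_060 += rr < 0.60
if n_sig:
    print(f"P(RR < 0.70 | significant) = {below_070 / n_sig:.2f}")
    print(f"P(RR < 0.60 | significant) = {below_060 / n_sig:.2f}")
```

Increasing the number of repetitions to 10,000 matches the scale of the study; the conditional proportions printed at the end correspond, pooled over all cumulative looks, to the quantities summarized in Figures 1 and 2.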