Systematic reviews should explore a priori hypotheses to explain heterogeneity even when I² is low

Authors
Siemieniuk R¹, Meade M¹, Alonso P², Briel M¹, Vandvik P³, Guyatt G¹
¹McMaster University, Canada
²Iberoamerican Cochrane Center, Biomedical Research Institute (IIB-Sant Pau-CIBERESP), Barcelona, Spain
³Norwegian Knowledge Centre for the Health Services, Oslo, Norway
Abstract
Background: There is general agreement that systematic review authors should generate a priori hypotheses to explain heterogeneity and test these hypotheses when heterogeneity proves substantial. Controversy exists, however, when the meta-analysis suggests low heterogeneity, as represented by a low I². In these circumstances, some authors advocate, and practice, omitting statistical exploration of heterogeneity; others disagree.
Objective: To illustrate the advisability of exploring possible subgroup effects even when I² is low.
Method: We conducted a systematic review and meta-analysis of adjunctive corticosteroids in patients with community-acquired pneumonia. We generated four a priori hypotheses to explain heterogeneity, including the severity of pneumonia (we expected a larger effect on mortality in trials in which more than 70% of patients had severe pneumonia).
Results: Random-effects meta-analysis showed a relative risk (RR) of 0.67, 95% confidence interval (CI) 0.47 to 0.97, I² = 7%, for overall mortality (Figure 1). Despite the low I², we undertook chi-square tests for effect modification for our a priori hypotheses. We found an apparent mortality benefit in trials that met our 'more severe' criteria (6 studies; n = 388; RR = 0.39, 95% CI 0.22 to 0.67; I² = 0%) but not in those that did not (6 studies; n = 1586; RR = 1.00, 95% CI 0.64 to 1.56; I² = 0%; interaction P = 0.009; Figure 2). The subgroup finding gains credibility from the large magnitude of effect, its biological plausibility (a greater inflammatory response in more severe pneumonia), the small number of a priori hypotheses with a specified direction, and the small interaction P value. It is based, however, on differences between studies rather than within studies, and was driven to a considerable extent by a small study that was stopped early for benefit and almost certainly represents a large overestimate of effect. Overall, the credibility of the subgroup effect is moderate.
Conclusions: A low I² should not deter systematic review authors from exploring a priori hypotheses to explain heterogeneity.
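
Illustrative calculation: The interaction test reported in the Results can be approximated from the published subgroup summaries alone. The sketch below is not the authors' analysis, which would have used study-level data. It assumes that each subgroup's standard error can be recovered from its 95% CI on the log scale and that the two subgroups are independent. The function names are hypothetical.

```python
import math

def log_rr_and_se(rr, ci_low, ci_high, z_crit=1.96):
    """Return the log relative risk and its standard error.

    Assumption: the CI is symmetric on the log scale, so its width
    equals 2 * z_crit * SE.
    """
    log_rr = math.log(rr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_rr, se

def subgroup_interaction(sub_a, sub_b):
    """Chi-square test (1 degree of freedom) for a difference between
    two independent subgroup estimates, each given as (RR, CI low, CI high)."""
    est_a, se_a = log_rr_and_se(*sub_a)
    est_b, se_b = log_rr_and_se(*sub_b)
    diff = est_a - est_b
    # With two subgroups, the between-subgroup Q statistic reduces to
    # the squared difference divided by the sum of the variances.
    q = diff ** 2 / (se_a ** 2 + se_b ** 2)
    # Upper-tail probability of a chi-square variable with 1 df.
    p = math.erfc(math.sqrt(q / 2))
    return q, p

# Subgroup summaries as reported in the Results.
more_severe = (0.39, 0.22, 0.67)
less_severe = (1.00, 0.64, 1.56)

q, p = subgroup_interaction(more_severe, less_severe)
print(f"Q = {q:.2f}, interaction P = {p:.3f}")
```

Under these assumptions the interaction P value comes out close to 0.01, in line with the reported P = 0.009. The two subgroups each have I² = 0% and the pooled I² is only 7%, yet the test for subgroup differences is clearly significant.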