I-square statistic in meta-analysis of prevalence: worthwhile or worthless?

Authors
Borges Migliavaca C1, Stein C2, Colpani V2, Hugh Barker T3, Munn Z3, Falavigna M2
1National Institute for Health Technology Assessment, Post-Graduate Program in Epidemiology, Federal University of Rio Grande do Sul
2Hospital Moinhos de Vento
3Joanna Briggs Institute, University of Adelaide
Abstract
Background: Prevalence estimates are critical for health decision making, and systematic reviews (SRs) with meta-analyses are useful for generating a pooled estimate of prevalence. Heterogeneity is an important consideration in meta-analysis and is commonly assessed with the I² statistic.

Objectives: To describe and evaluate the use of the I² statistic to assess heterogeneity in meta-analysis of prevalence.

Methods: This work is from the Prevalence Estimates Reviews – Systematic Review (PERSyst) Methodology Group. We searched PubMed for records published from February 2017 to February 2018 with the terms ‘prevalence’ and ‘systematic review’ in the title. We included SRs on the prevalence of any clinical condition published in English. If the SR conducted a meta-analysis, we extracted data regarding the assessment of heterogeneity. For the analysis, we classified I² as high (>50%) or low (≤50%). We then used the Mann-Whitney test to assess the association between the I² category and the number of studies included in each meta-analysis.
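The classification and comparison described in the Methods can be sketched as follows. This is a minimal illustration only: the abstract does not state which software was used, the `records` values are hypothetical and not study data, and the function names `classify_i2` and `mann_whitney_u` are our own. The p-value uses the normal approximation without tie or continuity correction, so an established statistics package would be preferable for a real analysis.

```python
import math
from statistics import NormalDist


def classify_i2(i2_percent):
    """Label an I² value (in percent) using the 50% cut-off from the Methods."""
    return "high" if i2_percent > 50 else "low"


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value).

    U counts the pairs in which a value from x exceeds a value from y
    (ties count one half). The p-value is a normal approximation with
    no tie correction and no continuity correction.
    """
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value


# Hypothetical (I² in percent, number of included studies) pairs.
records = [(98.1, 25), (45.0, 8), (96.9, 19), (30.2, 6),
           (99.0, 40), (12.5, 9), (91.3, 14), (88.0, 11)]

# Split the number of studies by I² category.
high = [k for i2, k in records if classify_i2(i2) == "high"]
low = [k for i2, k in records if classify_i2(i2) == "low"]

u_high, p_value = mann_whitney_u(high, low)
print(f"U = {u_high}, p = {p_value:.3f}")
```

With groups as small as in this toy example, an exact test would be more appropriate than the normal approximation used here.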

Results: We included 235 SRs; 152 performed meta-analysis, and 144 assessed heterogeneity through I², according to the description of their methodology. However, only 134 presented the I² result for their main meta-analysis. The median I² was 96.9% (interquartile range [IQR] 90.5 to 98.7). Seven meta-analyses (5%) presented I² ≤50%; 3 (2%) presented I² >50% to 70%; and 124 (93%) presented I² >70%. Of note, 102 meta-analyses (76%) presented I² higher than 90%. There was an association between the number of studies included in the meta-analysis and the level of I²: meta-analyses with I² >50% included more studies (median 19, IQR 10 to 28) than meta-analyses with I² ≤50% (median 9, IQR 6.5 to 9.5; p = 0.004). All meta-analyses with more than 21 included studies presented I² >50% (Table 1). Despite the high inconsistency observed, only 3 (2%) SRs reported prediction intervals.

Conclusions: Overall, meta-analyses of prevalence commonly present high inconsistency. This may reflect the nature of proportional data: studies often draw on large datasets and so provide precise estimates, and variance is small even in studies with small sample sizes, which leads to minimal overlap of confidence intervals in this type of meta-analysis. Moreover, true heterogeneity is expected in prevalence estimates because of differences in the time and place in which the included studies were conducted. The I² statistic may therefore not be discriminative and should be interpreted with caution in this setting. Prediction intervals are a more conservative way to incorporate uncertainty in the analysis when true heterogeneity is expected; however, they are still underused in meta-analyses of prevalence. Whilst our study was limited to the evaluation of SRs of prevalence, we expect similar conclusions for reviews of other proportions (such as incidence).
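The dependence of I² on study precision, and the role of a prediction interval, can be illustrated with a small sketch. It is not the authors' analysis. It assumes raw (untransformed) proportions with inverse-variance weights, the usual definition I² = max(0, (Q − df)/Q) × 100%, the DerSimonian-Laird estimate of between-study variance, and a prediction interval of the form pooled estimate ± t × sqrt(tau² + SE²) with k − 2 degrees of freedom. The prevalence values, sample sizes and function names are hypothetical, and the t critical value is passed in from a table rather than computed.

```python
import math


def heterogeneity(props, sizes):
    """Return (Q, I² in percent, DerSimonian-Laird tau², within-study variances)."""
    variances = [p * (1 - p) / n for p, n in zip(props, sizes)]
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, props)) / total_w
    q = sum(w * (p - pooled) ** 2 for w, p in zip(weights, props))
    df = len(props) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2, variances


def prediction_interval(props, variances, tau2, t_crit):
    """Return (lower, pooled random-effects estimate, upper)."""
    re_weights = [1 / (v + tau2) for v in variances]
    total_w = sum(re_weights)
    mu = sum(w * p for w, p in zip(re_weights, props)) / total_w
    se_mu = math.sqrt(1 / total_w)
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return mu - half_width, mu, mu + half_width


# Hypothetical prevalences, identical in both scenarios.
props = [0.10, 0.12, 0.14, 0.16]

# Same prevalences, small versus large studies: only precision changes.
_, i2_small, _, _ = heterogeneity(props, [100] * 4)
_, i2_large, tau2, variances = heterogeneity(props, [10_000] * 4)
print(f"I² with n = 100 per study:    {i2_small:.1f}%")
print(f"I² with n = 10,000 per study: {i2_large:.1f}%")

# t critical value for a 95% interval with k - 2 = 2 degrees of freedom,
# taken from a t table (4.303).
lower, mu, upper = prediction_interval(props, variances, tau2, t_crit=4.303)
print(f"Pooled prevalence {mu:.3f}, 95% prediction interval {lower:.3f} to {upper:.3f}")
```

In this toy example the prevalences are the same in both scenarios, so the change in I² comes only from the within-study precision. Published meta-analyses of prevalence often work on a transformed scale (for example logit or double arcsine), which this sketch leaves out for brevity.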

Patient or healthcare consumer involvement: Prevalence estimates play a key role in supporting healthcare decision making. Understanding the underlying heterogeneity of these data is critical to decision making.