Appropriateness of asymmetry tests for publication bias in meta-analyses: a large-scale survey

Authors
Ioannidis J, Trikalinos T
Abstract
Background: Statistical tests for funnel plot asymmetry are commonly applied in meta-analyses, but inappropriate application can generate misleading inferences.

Objectives: To evaluate how often asymmetry tests would be appropriate to apply in meta-analyses.

Methods: We evaluated all 6,873 non-identical meta-analyses of binary outcomes with >=3 studies in the Cochrane Database of Systematic Reviews (2003, Issue 2). A separate analysis selected the largest meta-analysis per review (n=846 meta-analyses). In each meta-analysis, we assessed the extent of heterogeneity, the number of studies, the ratio of the maximal to the minimal study variance, and the availability of at least one study with statistically significant results, in order to judge whether asymmetry tests would be appropriate to apply. We applied a correlation test and two regression asymmetry tests and evaluated their concordance. Finally, we sampled 60 meta-analyses from print journals in 2005 citing use of the standard regression asymmetry test.
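The standard regression asymmetry test referred to above is the one commonly attributed to Egger and colleagues: the standardized effect of each study is regressed on its precision, and an intercept far from zero is taken as evidence of funnel plot asymmetry. A minimal sketch (not the survey's actual analysis code; the toy interface and OLS-based inference are assumptions) might look like:

```python
import numpy as np

def egger_test(effects, ses):
    """Sketch of a regression asymmetry (Egger-type) test.

    Regresses the standardized effect (effect / SE) on precision (1 / SE)
    by ordinary least squares. An intercept far from zero suggests funnel
    plot asymmetry. Returns (intercept, SE of intercept, t statistic);
    a p-value would come from a t distribution with n - 2 df.
    """
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    y = effects / ses                        # standardized effects
    x = 1.0 / ses                            # precisions
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    s2 = resid @ resid / (n - 2)             # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)        # OLS covariance of coefficients
    se_intercept = np.sqrt(cov[0, 0])
    return beta[0], se_intercept, beta[0] / se_intercept
```

The correlation test mentioned in the Methods (commonly attributed to Begg and Mazumdar) instead rank-correlates standardized effects with their variances; both approaches probe the same funnel plot asymmetry from different angles.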

Results: Only 366 of the 6,873 meta-analyses (5%) would qualify for use of asymmetry tests under stringent criteria (heterogeneity I² < 50%, non-significant Q statistic, >=10 studies, ratio of extreme variances >4, >=1 significant study). Under more lenient criteria (I² < 50%, any Q, >=5 studies, ratio of extreme variances >2, >=1 significant study), 1,454 meta-analyses (21%) would qualify. The respective percentages were 12% and 33% for the largest meta-analysis per systematic review. Asymmetry tests were significant in only 7-18% of the meta-analyses. Kappa coefficients among the tests were modest (0.33-0.66 and 0.33-0.64 in the two data sets, respectively). Of the 60 journal meta-analyses, 53 (88%) and 45 (75%) failed the stringent and lenient criteria, respectively; all 11 claims of identified publication bias were made in the presence of large and statistically significant heterogeneity.

Conclusions: Asymmetry tests are appropriate in only a minority of meta-analyses; even where applied, the different tests agree only modestly and their reliability is unknown. Publication and related biases should be addressed at their root rather than probed retrospectively.