Background: Publication and related biases in meta-analysis are often examined by checking for asymmetry in funnel plots of the treatment effect against its standard error. Formal statistical tests of funnel plot asymmetry have been proposed previously (Begg & Mazumdar, Biometrics 1994; Egger et al., BMJ 1997). However, when applied to binary outcome data, these tests can give false-positive rates that exceed the nominal level in some situations (large treatment effects, few events per trial, or all trials of similar sizes). A suggested alternative (Macaskill et al., Statistics in Medicine 2001) may have a lower false-positive rate, but at the expense of lower power.
Objectives:
1. To develop a modified linear regression test that has a well-controlled false-positive rate in a wider range of situations than the test first proposed by Egger et al., while retaining reasonable power.
2. To assess its statistical properties in meta-analyses typical of controlled clinical trials, including between-study heterogeneity.
3. To compare its properties to those of existing tests.
Methods: We suggest a modified version of Egger's linear regression test based on the efficient score and its variance (Fisher's information). The performance of this test is compared to the other proposed tests in simulation analyses based on the characteristics of published controlled trials, including varying degrees of between-study heterogeneity.
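As an illustration of the kind of calculation involved, the following minimal sketch computes the efficient score Z and its variance V for each trial's 2x2 table and then regresses Z/sqrt(V) on sqrt(V) by ordinary least squares, treating a non-zero intercept as evidence of small-study effects. The formulas used for Z and V are the standard ones for the log odds ratio evaluated at zero and are an assumption here, since this summary does not state them. The function names and the table layout (events and non-events in the treatment arm, then in the control arm) are hypothetical and chosen only for this example.

```python
import math


def score_and_variance(a, b, c, d):
    """Efficient score Z and Fisher information V for the log odds ratio at zero.

    Assumed 2x2 layout: a = treatment events, b = treatment non-events,
    c = control events, d = control non-events.
    """
    n = a + b + c + d
    z = a - (a + b) * (a + c) / n  # observed minus expected events in the treatment arm
    v = (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return z, v


def modified_regression_test(tables):
    """Regress Z/sqrt(V) on sqrt(V) by ordinary least squares.

    Returns (intercept, standard error of intercept, t statistic, degrees of freedom).
    The t statistic would be referred to a t distribution with k - 2 degrees of freedom.
    """
    xs, ys = [], []
    for a, b, c, d in tables:
        z, v = score_and_variance(a, b, c, d)
        xs.append(math.sqrt(v))
        ys.append(z / math.sqrt(v))

    k = len(xs)
    x_bar = sum(xs) / k
    y_bar = sum(ys) / k
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance and the usual OLS standard error of the intercept.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    df = k - 2
    se_intercept = math.sqrt(rss / df * (1 / k + x_bar ** 2 / sxx))
    return intercept, se_intercept, intercept / se_intercept, df


# Hypothetical trial data, for illustration only: (a, b, c, d) per trial.
example_tables = [
    (12, 88, 20, 80),
    (5, 45, 9, 41),
    (30, 170, 45, 155),
    (3, 27, 6, 24),
    (50, 450, 70, 430),
]
intercept, se, t_stat, df = modified_regression_test(example_tables)
```

In this sketch the slope plays the role of a pooled log odds ratio estimate, while the intercept captures any tendency for smaller trials to show different effects. The sketch needs at least three trials with differing V, and it stops at the t statistic; obtaining a p-value requires a t distribution function from a statistics library.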
Results: When there is little or no between-trial heterogeneity, the modified test has a false-positive rate close to the nominal level while maintaining similar power to the linear regression test. When the degree of between-trial heterogeneity is large (between-study variance in log-odds ratio greater than around 0.04), none of the tests has uniformly good properties. The power of all tests is low unless the number of studies is large.
Conclusions: The modified test shows good properties in a wider range of circumstances than existing tests of small-study effects in meta-analyses of controlled trials with binary endpoints. However, none of the tests should be used on studies with binary endpoints when there is a moderate or large amount of between-study heterogeneity in treatment effects.