Another decision to evaluate: choice of standardized mean difference effect size estimator

1998 Baltimore

Cannella KAS

The standardized mean difference is the effect size generally recommended for clinical trials and other studies that assess treatment effects on outcomes measured on a continuous scale. Initially, the standard deviation of choice was that of the control group, which Glass recommended for use in meta-analysis. Hedges argued that standardizing the mean difference by the pooled standard deviation yields a more precise effect size estimate. Pooling, however, requires the assumption that the population variances of the two groups are equal. If this assumption does not seem warranted, the standardized mean difference can instead be computed using the standard deviation of one of the two groups under comparison.
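The two ways of standardizing described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in this study; the function and parameter names (`pooled_sd`, `smd_pooled`, `smd_control`, `mean_t`, `sd_c`, and so on) are hypothetical and chosen for readability.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of the treatment (t) and control (c) groups.

    Assumes the two population variances are equal.
    """
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return math.sqrt(pooled_var)


def smd_pooled(mean_t, mean_c, sd_t, n_t, sd_c, n_c):
    """Mean difference standardized by the pooled standard deviation."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


def smd_control(mean_t, mean_c, sd_c):
    """Mean difference standardized by the control group standard deviation."""
    return (mean_t - mean_c) / sd_c
```

When the two groups have the same standard deviation, both functions return the same value; they diverge as the group standard deviations diverge.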

Glass argued that a treatment with an effect on the outcome of interest, especially a clinically important effect, might differentiate the two groups enough that they could no longer be considered to constitute one common population. In this case, the mean difference may be standardized using the standard deviation of either the control or the treatment group. One unintended consequence of the treatment may be reduced variability in the treatment group on the outcome being measured. The pooled standard deviation would then be smaller than that of the control group, so standardizing the mean difference by the control group standard deviation may provide a more conservative effect size estimate. Using the standard deviation of the control group also makes logical sense, since the control group, whatever its type (untreated, placebo, usual care, etc.), acts as the reference group to which the treatment group is compared.

Empirical evidence was collected to illustrate the potential impact of this decision. For a convenience sample of randomized controlled trials, the Effect Size Calculator software program was used to compute biased and unbiased effect sizes, standardizing the mean differences by both the pooled standard deviation and that of the control group. There was considerable variability in the difference between control group and pooled standard deviations within and across studies. For instance, in one study the difference in these standard deviations ranged from -1.37 to 3.91 across different outcome measures. While the impact of the choice of effect size estimator may have been minimal in most cases, in a few cases the difference between estimates was large enough (0.067 and 0.083) that it could influence interpretation of the estimate.
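The distinction between biased and unbiased effect sizes can be illustrated with the widely used small-sample approximation J = 1 - 3 / (4 df - 1), where df is the degrees of freedom of the standard deviation used to standardize. The sketch below assumes this approximation; whether the Effect Size Calculator program applies exactly this formula is not stated here, and the function names are hypothetical.

```python
def small_sample_correction(df):
    """Approximate small-sample correction factor J = 1 - 3 / (4*df - 1).

    df is the degrees of freedom of the standardizing standard deviation:
    n_t + n_c - 2 for the pooled SD, n_c - 1 for the control group SD.
    """
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def unbiased_smd(biased_smd, df):
    """Shrink a biased standardized mean difference toward zero by J."""
    return small_sample_correction(df) * biased_smd
```

Because the control group standard deviation has fewer degrees of freedom than the pooled one, this correction shrinks the control-standardized estimate slightly more than the pooled-standardized estimate.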

Because the true situation can only be estimated, the correct choice of effect size estimator cannot be known. However, the impact of this decision can be evaluated by conducting a sensitivity analysis using the alternative standardized mean difference effect size estimator. Finding minimal or no differences in effect size estimates when an alternative standard deviation is used to standardize the mean differences would enhance the credibility of the findings. In contrast, finding that the choice of estimator resulted in substantial variation in effect size estimates would warrant further investigation. Additional exploration of these differences is recommended.
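A sensitivity analysis of this kind might look like the following Python sketch. It computes both estimators for each outcome and flags outcomes where they differ by more than a chosen amount. The function name, the dictionary keys, the example data, and the `threshold` default of 0.05 are all hypothetical and for illustration only; what counts as a substantial difference is a judgment for the analyst.

```python
import math


def sensitivity_check(outcomes, threshold=0.05):
    """Compare pooled-SD and control-SD standardized mean differences.

    Each outcome is a dict with keys: name, mean_t, mean_c, sd_t, sd_c,
    n_t, n_c. The threshold is an arbitrary illustrative cut-off.
    """
    rows = []
    for o in outcomes:
        diff = o["mean_t"] - o["mean_c"]
        pooled_var = (
            (o["n_t"] - 1) * o["sd_t"] ** 2 + (o["n_c"] - 1) * o["sd_c"] ** 2
        ) / (o["n_t"] + o["n_c"] - 2)
        es_pooled = diff / math.sqrt(pooled_var)
        es_control = diff / o["sd_c"]
        gap = es_control - es_pooled
        rows.append(
            {
                "name": o["name"],
                "es_pooled": es_pooled,
                "es_control": es_control,
                "gap": gap,
                "flagged": abs(gap) > threshold,
            }
        )
    return rows


# Made-up summary statistics, not data from the trials described here.
example_outcomes = [
    {"name": "outcome A", "mean_t": 12.0, "mean_c": 10.0,
     "sd_t": 3.0, "sd_c": 4.0, "n_t": 20, "n_c": 20},
    {"name": "outcome B", "mean_t": 12.0, "mean_c": 10.0,
     "sd_t": 4.0, "sd_c": 4.0, "n_t": 20, "n_c": 20},
]

rows = sensitivity_check(example_outcomes)
for row in rows:
    print(row)
```

In this made-up example, outcome A has a less variable treatment group, so the two estimates diverge and the outcome is flagged; outcome B has equal standard deviations, so the estimates coincide.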