Quality assessments of randomized controlled trials: an evaluation by the Chalmers versus the Jadad method

Authors
Ohlsson A, Lacy JB
Abstract
Introduction: More than 25 scales to assess the quality of randomized controlled trials (RCTs) have been published (Moher D. et al. Control Clin Trials 1995;16:62), but only one (range of score: 1 - 5) has been validated (Jadad A.R., DPhil Thesis, University of Oxford, 1994). One commonly used scale (range of score: 0.0 - 1.0) (Chalmers T.C. et al. Control Clin Trials 1981;2:31) has been criticized because one study found no correlation between quality score and effect size (Emerson J.D. et al. Control Clin Trials 1990;11:339).

Objective: The purpose of this study was to apply the Jadad method (developed for quality assessment of RCTs in pain relief) to 33 perinatal RCTs that could have been double blinded and that we had previously assessed by the Chalmers method, and to evaluate the correlation between the two scores.

Methods: We obtained consensus quality scores on 51 perinatal RCTs using the Chalmers method as part of 4 critical overviews [Ohlsson A. Am J Obstet Gynecol 1989;160:890, Ohlsson A., Lacy J.B., J Cur Opinion Pediatr 1993;5:1142, Ohlsson A., Myhr T. Am J Obstet Gynecol 1994;170:910, Lacy J.B., Ohlsson A. Arch Dis Child 1995 (in press)]. The same trials (excluding 18 that could not be double blinded) were then scored independently by the 2 authors using the Jadad method. Agreement between the 2 researchers on the Jadad quality scores assigned to the 33 trials was assessed using the intra-class correlation coefficient (ICC). In case of disagreement, a final quality score was obtained by consensus. The ICC had previously been calculated for the scores obtained in the 4 overviews using the Chalmers method. The Spearman Correlation Coefficient (SCC) was used to determine the correlation between the consensus quality scores for the 2 methods.
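
The two statistics named in the Methods can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state which form of the ICC was used (a one-way random-effects ICC for two raters is shown), the function names are hypothetical, and the scores are invented rather than taken from the 33 trials.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def icc_oneway(rater_a, rater_b):
    """One-way random-effects ICC for two raters (an assumed ICC form)."""
    n, k = len(rater_a), 2
    pairs = list(zip(rater_a, rater_b))
    grand = mean(list(rater_a) + list(rater_b))
    # Between-trial and within-trial mean squares from a one-way ANOVA.
    ms_between = k * sum((mean(p) - grand) ** 2 for p in pairs) / (n - 1)
    ms_within = sum((v - mean(p)) ** 2 for p in pairs for v in p) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Invented scores for six hypothetical trials (not data from this study).
jadad_rater_1 = [1, 2, 3, 5, 4, 1]
jadad_rater_2 = [1, 2, 3, 4, 4, 2]
jadad_consensus = [1, 2, 3, 5, 4, 1]
chalmers_consensus = [0.30, 0.45, 0.55, 0.80, 0.70, 0.25]

print("ICC between raters (Jadad):", icc_oneway(jadad_rater_1, jadad_rater_2))
print("SCC, Jadad vs. Chalmers:", spearman(jadad_consensus, chalmers_consensus))
```

The rank-based SCC suits this comparison because the two scales have different ranges (1 - 5 vs. 0.0 - 1.0) and the Jadad scale takes only a few discrete values, so ties are handled by averaging ranks.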

Results: In the 4 published overviews, the ICC for agreement between the reviewers using the Chalmers method ranged from 0.83 to 0.98. The ICC using the Jadad method in this study was 0.96. The SCC was 0.84 (p=0.0001). Fourteen RCTs scored only 1 point on the Jadad scale; on the Chalmers scale their scores ranged from 0.25 to 0.66. We found the Jadad method to be much less time-consuming than the Chalmers method: approximately 5 vs. 60 minutes for each quality assessment. We experienced difficulty in interpreting some of the instructions for both scales.

Discussion: As the SCC was high, both scales seem to be appropriate for quality assessments of perinatal trials that can be double blinded. Instructions for both scales need to be more explicit. Quality scores obtained by each method should be compared with effect size. Methods to evaluate the quality of RCTs that cannot be double blinded need to be developed.