Assessing the Quality of Reports of Systematic Reviews and Meta-analyses: A Systematic Review of Checklists and Scales

1999 Rome

Shea B, Dube C, Moher D

Introduction: Readers need to be confident that the results from meta-analyses (MAs) are as free as possible of bias. MAs with minimal bias are more likely to be valid and to be more widely disseminated into effective health care practice and policy. One way to assess the merits of MAs is to assess the validity of their reports.

Objectives: We set out to identify and appraise instruments developed to assess the quality of reports of Systematic Reviews (SRs) and MAs.

Methods: We searched MEDLINE up to and including February 1999. The potentially relevant articles (n=318) were reviewed independently by all three authors. To be included, an instrument had to be either a scale or a checklist and had to be designed to assess the quality of SRs and MAs; it could be reported in any language. The authors agreed a priori on definitions of checklists and scales. The instruments were first assessed for the following: number of items, type of quality assessed, inclusion of an explicit statement regarding the purpose of the tool, and the amount of time required to complete the tool. Each instrument was then compared to an evidence-based standard. Finally, using a convenience sample of four MAs, we assessed the stability of quality assessment among selected instruments.
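
As an illustration of how such a stability comparison could be carried out, scores from instruments with different maximums can be expressed as a percentage of the maximum possible value, and the MAs can then be ranked within each instrument. The Python sketch below is only one possible approach under stated assumptions: the instrument names, maximum scores, and raw scores are entirely hypothetical and are not data from this review.

```python
# Hypothetical data for illustration only: these instrument names,
# maximum scores, and raw scores do not come from the review.
MAX_SCORE = {"instrument_a": 20, "instrument_b": 10}
RAW = {
    "instrument_a": {"ma_1": 5, "ma_2": 6, "ma_3": 4, "cochrane": 12},
    "instrument_b": {"ma_1": 3, "ma_2": 2, "ma_3": 1, "cochrane": 7},
}


def percent_of_max(raw, max_score):
    """Express each raw score as a percentage of the maximum possible value."""
    return {ma: 100.0 * score / max_score for ma, score in raw.items()}


def ranks(scores):
    """Rank MAs from highest (1) to lowest score; assumes no tied scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {ma: position + 1 for position, ma in enumerate(ordered)}


def rank_ranges(raw_by_instrument, max_by_instrument):
    """Return the (best, worst) rank each MA receives across all instruments."""
    per_instrument = {
        name: ranks(percent_of_max(raw, max_by_instrument[name]))
        for name, raw in raw_by_instrument.items()
    }
    mas = next(iter(raw_by_instrument.values())).keys()
    return {
        ma: (
            min(r[ma] for r in per_instrument.values()),
            max(r[ma] for r in per_instrument.values()),
        )
        for ma in mas
    }


if __name__ == "__main__":
    for ma, (best, worst) in rank_ranges(RAW, MAX_SCORE).items():
        print(f"{ma}: rank range {best}-{worst}")
```

A narrow rank range for each MA would suggest that the instruments order the reports similarly, even if the absolute percentages differ.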

Results: Twenty-six instruments were included in our review: 23 checklists and 3 scales. When compared to an evidence-based standard, none of the instruments included all recommended items. The majority of instruments contained items about what the methods sections of MAs should include and generally neglected the other components of the report. Only one included an item regarding the title and two addressed the abstract. Fifty percent (possibly 58%) of the instruments included an item about the introduction. No instrument suggested presenting a trial flow diagram in the results. In contrast, the majority of instruments included an item about the description of study characteristics (73%, possibly 85%) or about the quantitative data synthesis (77%, possibly 85%). Sixty percent (possibly 85%) of the instruments included an item about the discussion. Three of the four selected MAs were paper-based; the fourth was a Cochrane review. The quality of the report of each MA was fairly stable across instruments. The quality ranged from 26% to 34% of the maximum possible value. The Cochrane review received the highest quality score regardless of the instrument used. Rank ranges were likewise stable across instruments, with the Cochrane review ranked highest by all four instruments.

Discussion: Approximately two dozen instruments have been developed to assess various aspects of SRs and MAs. When compared to an evidence-based standard, these instruments are generally incomplete and inconsistent. However, quality assessment was fairly stable across selected instruments. The MAs in the convenience sample received relatively low quality scores, with the exception of the Cochrane review.