Methodological quality of systematic reviews in subfertility: a comparison of two different approaches

Authors
Popovich I1, Windsor B1, Jordan V2, Showell M3, Shea B4, Farquhar C5
1University of Auckland, New Zealand
2New Zealand Branch of the Australasian Cochrane Centre, Department of Obstetrics and Gynaecology, Auckland University, New Zealand
3Cochrane Menstrual Disorders and Subfertility Group, Department of Obstetrics and Gynaecology, The University of Auckland, New Zealand
4CIETcanada, University of Ottawa, Canada
5Department of Obstetrics and Gynaecology and National Women’s Health, University of Auckland, New Zealand
Abstract
Background: Systematic reviews are used widely to guide health care decisions. Several tools have been created to assess systematic review quality. The AMSTAR tool applies a yes/no score to eleven relevant domains of review methodology. This tool has been reworked so that each domain is scored on a four-point scale, producing r-AMSTAR.

Objectives: To compare the AMSTAR and r-AMSTAR tools in the assessment of systematic reviews in the field of subfertility.

Methods: All published systematic reviews on assisted reproductive technology whose latest search for studies took place between 2007 and 2011 were considered. Reviews that contained no included studies or considered diagnostic outcomes were excluded. Thirty Cochrane and thirty non-Cochrane reviews were randomly selected from a search of relevant databases. Both tools were then applied to all sixty reviews. The results were converted to percentage scores, and all reviews were graded and ranked on the basis of these scores.
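
The conversion to percentage scores and the ranking step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state how percentages were derived, so the sketch assumes the AMSTAR percentage is the share of the eleven domains answered "yes" and the r-AMSTAR percentage is the summed domain scores (1 to 4 each) divided by the maximum of 44. The function names and the tie-breaking rule are hypothetical.

```python
N_DOMAINS = 11          # both tools assess eleven domains
R_AMSTAR_MAX_PER_DOMAIN = 4  # r-AMSTAR scores each domain from 1 to 4


def amstar_percent(answers):
    """Percentage of the eleven AMSTAR domains answered 'yes' (assumed conversion)."""
    if len(answers) != N_DOMAINS:
        raise ValueError("AMSTAR expects eleven yes/no answers")
    return 100.0 * sum(bool(a) for a in answers) / N_DOMAINS


def r_amstar_percent(scores):
    """Summed r-AMSTAR domain scores as a percentage of the maximum, 44 (assumed conversion)."""
    if len(scores) != N_DOMAINS:
        raise ValueError("r-AMSTAR expects eleven domain scores")
    if any(s < 1 or s > R_AMSTAR_MAX_PER_DOMAIN for s in scores):
        raise ValueError("each r-AMSTAR domain score must be between 1 and 4")
    return 100.0 * sum(scores) / (N_DOMAINS * R_AMSTAR_MAX_PER_DOMAIN)


def rank_reviews(percent_by_review):
    """Rank reviews from highest to lowest percentage; rank 1 is best.

    Ties are broken by review identifier, a hypothetical choice made
    only to keep the ordering deterministic.
    """
    ordered = sorted(percent_by_review, key=lambda r: (-percent_by_review[r], r))
    return {review: position + 1 for position, review in enumerate(ordered)}
```

Under this assumed conversion an r-AMSTAR percentage cannot fall below 25%, because every domain scores at least 1, whereas an AMSTAR percentage can reach 0%.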

Results: Table 1 shows the breakdown of grades. AMSTAR produced a much wider variation in percentage scores and achieved higher inter-rater reliability than r-AMSTAR according to kappa statistics. The average rating for Cochrane reviews was consistent between the two tools but inconsistent for non-Cochrane reviews (63.9% with r-AMSTAR vs. 38.5% with AMSTAR). When the rankings generated by the two tools were compared, Cochrane reviews changed by an average of 4.2 places, compared with 2.9 places for non-Cochrane reviews.
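
The two summary statistics reported here, kappa and the average change in rank, can be illustrated with a short sketch. The abstract does not say which kappa variant was used or how rank changes were averaged, so this assumes unweighted Cohen's kappa for two raters and the mean absolute difference in rank position; both function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def mean_rank_change(ranks_tool_x, ranks_tool_y):
    """Mean absolute difference in rank position between two rankings of the same reviews."""
    if set(ranks_tool_x) != set(ranks_tool_y) or not ranks_tool_x:
        raise ValueError("both rankings must cover the same non-empty set of reviews")
    total = sum(abs(ranks_tool_x[r] - ranks_tool_y[r]) for r in ranks_tool_x)
    return total / len(ranks_tool_x)
```

For a four-point scale such as r-AMSTAR, a weighted kappa that gives partial credit for near-agreement is a common alternative; the unweighted form is used here only because it applies to both tools without further assumptions.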

Conclusions: r-AMSTAR provided greater guidance in the assessment of domains and produced quantitative and informative results. However, there were many problems with the construction of its criteria, and AMSTAR was much easier to apply consistently. We recommend that the findings of this study be incorporated into a revised AMSTAR tool that generates a more informative assessment of each domain and gives greater guidance in doing so, while taking care to avoid the issues found with r-AMSTAR.