Methodological quality of systematic reviews in subfertility: a comparison of two different approaches

2012 Auckland

Popovich I¹, Windsor B¹, Jordan V², Showell M³, Shea B⁴, Farquhar C⁵

¹University of Auckland, New Zealand

²New Zealand Branch of the Australasian Cochrane Centre, Department of Obstetrics and Gynaecology, Auckland University, New Zealand

³Cochrane Menstrual Disorders and Subfertility Group, Department of Obstetrics and Gynaecology, The University of Auckland, New Zealand

⁴CIETcanada, University of Ottawa, Canada

⁵Department of Obstetrics and Gynaecology and National Women’s Health, University of Auckland, New Zealand

Background: Systematic reviews are used widely to guide health care decisions. Several tools have been created to assess systematic review quality. The AMSTAR tool applies a yes/no score to eleven relevant domains of review methodology. This tool has been reworked so that each domain is scored on a four-point scale, producing r-AMSTAR.

Objectives: To compare the AMSTAR and r-AMSTAR tools in the assessment of systematic reviews in the field of subfertility.

Methods: All published systematic reviews on assisted reproductive technology whose latest search for studies took place between 2007 and 2011 were considered. Reviews that contained no included studies or considered diagnostic outcomes were excluded. Thirty Cochrane and thirty non-Cochrane reviews were randomly selected from a search of relevant databases. Both tools were then applied to all sixty reviews. The results were converted to percentage scores, and all reviews were graded and ranked on that basis.
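The conversion to percentage scores and the subsequent ranking can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how r-AMSTAR totals were converted, so the sketch assumes the total is divided by the maximum of 44 (eleven domains at four points each), and the function names and tie-breaking rule are hypothetical.

```python
AMSTAR_ITEMS = 11       # yes/no items in AMSTAR
R_AMSTAR_DOMAINS = 11   # r-AMSTAR domains, each scored 1 to 4
R_AMSTAR_MAX_POINTS = 4


def amstar_percent(answers):
    """Percentage of 'yes' answers; `answers` is a list of 11 booleans."""
    if len(answers) != AMSTAR_ITEMS:
        raise ValueError("AMSTAR expects 11 yes/no answers")
    return 100.0 * sum(answers) / AMSTAR_ITEMS


def r_amstar_percent(scores):
    """Percentage of the maximum total (assumed conversion: total / 44)."""
    if len(scores) != R_AMSTAR_DOMAINS:
        raise ValueError("r-AMSTAR expects 11 domain scores")
    if any(not 1 <= s <= R_AMSTAR_MAX_POINTS for s in scores):
        raise ValueError("each r-AMSTAR domain score must be between 1 and 4")
    return 100.0 * sum(scores) / (R_AMSTAR_MAX_POINTS * R_AMSTAR_DOMAINS)


def rank_reviews(percent_by_review):
    """Map review id -> rank, where rank 1 is the highest percentage.

    Ties are broken by review id (an arbitrary, hypothetical choice).
    """
    ordered = sorted(percent_by_review, key=lambda r: (-percent_by_review[r], r))
    return {review: position + 1 for position, review in enumerate(ordered)}
```

Under this assumed conversion, a review scoring the minimum in every r-AMSTAR domain still receives 25%, whereas the AMSTAR percentage can fall to 0%.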

Results: Table 1 shows the breakdown of grades. AMSTAR produced a much wider variation in percentage scores and achieved higher inter-rater reliability than r-AMSTAR according to kappa statistics. The average rating for Cochrane reviews was consistent between the two tools but inconsistent for non-Cochrane reviews (63.9% r-AMSTAR vs. 38.5% AMSTAR). When the rankings generated by the two tools were compared, Cochrane reviews changed by an average of 4.2 places, compared with 2.9 for non-Cochrane reviews.
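The two summary statistics mentioned here can be illustrated with a short sketch. It rests on assumptions: the abstract does not state which kappa statistic was used, so unweighted Cohen's kappa for two raters is shown, and the average change in ranking is taken to be the mean absolute difference in rank. Function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must score the same non-empty set of items")
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def mean_rank_change(ranks_tool_1, ranks_tool_2):
    """Mean absolute difference in rank for reviews ranked by both tools."""
    if not ranks_tool_1 or set(ranks_tool_1) != set(ranks_tool_2):
        raise ValueError("both rankings must cover the same non-empty set of reviews")
    total = sum(abs(ranks_tool_1[r] - ranks_tool_2[r]) for r in ranks_tool_1)
    return total / len(ranks_tool_1)
```

For the ordinal four-point scores of r-AMSTAR, a weighted kappa would be a natural alternative, since it gives partial credit for near agreement.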

Conclusions: r-AMSTAR provided greater guidance in the assessment of domains and produced quantitative and informative results. However, there were many problems with the construction of its criteria and AMSTAR was much easier to apply consistently. We recommend that AMSTAR incorporates the findings of this study and produces a revised tool that generates a more informative assessment of each domain, and gives greater guidance in doing so, while taking care to avoid the issues found with r-AMSTAR.
