Agreement in the assessment of systematic reviews with the AMSTAR 2 tool: novice versus expert methodologists

2019 Santiago

Martinez-Zapata MJ¹, Niño de Guzman E², Canelo C³, Vasquez-Mejia A⁴, Merchan-Galvis A⁵, Madera-Anaya M⁶, Viteri-Garcia A⁷, Muñoz P⁸, Zaror C⁸

¹Iberoamerican Cochrane Centre, Biomedical Research Institute Sant Pau (IIB Sant Pau), CIBER Epidemiología y Salud Pública (CIBERESP)

²Iberoamerican Cochrane Centre, Biomedical Research Institute Sant Pau (IIB Sant Pau). Universidad Autónoma de Barcelona

³Iberoamerican Cochrane Centre. CIBER Epidemiología y Salud Pública (CIBERESP). Universidad Autónoma de Barcelona

⁴Universidad Nacional Mayor de San Marcos, Lima

⁵Iberoamerican Cochrane Centre, Biomedical Research Institute Sant Pau (IIB Sant Pau). Universidad Autónoma de Barcelona, Spain. Universidad del Cauca, Popayán

⁶Iberoamerican Cochrane Centre. Universidad Autónoma de Barcelona, Spain. Universidad de Cartagena, Cartagena

⁷Cochrane Ecuador. Centro de Investigación en Salud Pública y Epidemiología Clínica (CISPEC). Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito

⁸Department of Pediatric Dentistry and Orthodontics, Faculty of Dentistry, Universidad de la Frontera, Temuco

Background: AMSTAR2 (A MeaSurement Tool to Assess systematic Reviews) is one of the tools used to assess the methodological quality of systematic reviews (SRs). We studied the agreement between novice and expert methodologists when using AMSTAR2 to assess the quality of SRs addressing the effectiveness and safety of restrictive versus liberal blood transfusion.

Objectives: to assess whether the AMSTAR2 tool is intuitive enough to use and interpret, and whether novices and experts agree on the majority of AMSTAR2 items.

Methods: we searched MEDLINE and the Cochrane Library for SRs comparing restrictive versus liberal red blood cell transfusion published between January 2016 and December 2017. Two independent reviewers conducted the search and extracted data from the included studies. We evaluated the methodological quality of the SRs using the new AMSTAR2 tool, following the recommendations of its authors. Each SR was evaluated by two reviewers, one novice and one expert (four novice-expert pairs in total). We calculated the kappa coefficient for each of the 16 AMSTAR2 questions (Q).
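The per-question agreement statistic can be illustrated with a minimal Python sketch. It assumes an unweighted Cohen's kappa for two raters, which the abstract does not specify, and the ratings shown are hypothetical and are not data from this study. The function name `cohen_kappa` is also illustrative.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    Assumption: the abstract does not say which kappa variant was used;
    this is the standard two-rater, unweighted form.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same category by both raters.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement expected from each rater's marginal category counts.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical category throughout: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings for one AMSTAR2 question across six SRs (not study data).
novice = ["yes", "yes", "no", "partial yes", "no", "yes"]
expert = ["yes", "no", "no", "partial yes", "yes", "yes"]
kappa_q = cohen_kappa(novice, expert)
print(round(kappa_q, 2))
```

In this toy example the raters agree on four of six SRs (observed agreement 0.67) while chance agreement is 0.39, so kappa is lower than the raw agreement would suggest. Repeating the calculation for each of the 16 questions gives one coefficient per item.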

Results: the 49 papers identified by the search included 19 SRs: 11 analysed only RCTs, and eight analysed both RCTs and observational studies. Ten SRs were conducted on surgical patients, and nine on patients with medical conditions; 18 SRs focused on adult populations, and one SR included neonates.

As we can see in Table 1, the kappa coefficient was:

- excellent (0.81-1) for question 1 (Q1; PICO definition: Population, Intervention, Comparator, Outcomes);

- acceptable (0.41-0.60) for Q2 (review methods), Q8 (description of the included studies), Q11 (meta-analysis and methods for the statistical combination of results), Q12 (meta-analysis and the potential impact of risk of bias (RoB) of individual studies on the results), Q13 (account for RoB in individual studies when interpreting or discussing the results of the review), Q14 (explanation and discussion of heterogeneity observed in the results), Q15 (meta-analysis and publication bias discussed regarding its possible impact on the review results);

- weak (0.21-0.40) for Q6 (performing data extraction in duplicate), Q9 (use of a satisfactory technique for assessing the RoB in individual studies), Q10 (reporting of the sources of funding for the included studies);

- poor (0.0-0.20) for Q4 (comprehensive literature search strategy), Q16 (conflict of interest); and

- bad (< 0) for Q3 (selection of the study designs for inclusion in the review), Q5 (performance of study selection in duplicate), Q7 (list of excluded studies and justification of exclusions).
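The interpretation bands listed above can be expressed as a small lookup, sketched here in Python. The function name `agreement_band` is hypothetical. The abstract names no band for 0.61-0.80, so the label returned for that range is a placeholder, and treating each upper bound as inclusive is also an assumption.

```python
import math


def agreement_band(kappa):
    """Map a kappa value to the agreement bands used in this abstract."""
    if math.isnan(kappa) or not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must be a number between -1 and 1")
    if kappa < 0:
        return "bad"
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "weak"
    if kappa <= 0.60:
        return "acceptable"
    if kappa <= 0.80:
        # Assumption: this range is not labelled in the abstract.
        return "not named in the abstract (0.61-0.80)"
    return "excellent"


# Illustrative values only, not the coefficients reported in Table 1.
for value in (-0.10, 0.15, 0.30, 0.50, 0.90):
    print(value, agreement_band(value))
```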

Conclusions: the AMSTAR2 tool shows reasonable agreement between novice and expert methodologists when used to evaluate systematic reviews. However, training would improve agreement on specific questions.

Patient or healthcare consumer involvement: none