Abstract
Background: The GRADE criteria for rating the quality of evidence can also be used to evaluate the results of diagnostic test accuracy studies, but more experience with this application is needed.
Objectives: We applied the GRADE methodology to published Cochrane diagnostic test accuracy reviews (DTARs) to better understand how GRADE can be applied to such reviews.
Methods: We selected three DTARs for diversity of clinical areas and methodological issues. At least three reviewers with expertise in the GRADE approach and/or in diagnostic test accuracy methodology independently rated the evidence in each review according to the five ‘GRADE domains’. Reviewers strove to explain each judgment by documenting all considerations. Two teleconferences were held to exchange experiences.
Results: Table 1 summarizes the main issues discussed. We observed that some reviewers assessed the evidence from the perspective of patient-important outcomes, while others assessed it from a test accuracy standpoint. Having a clear definition of the clinical question before starting the grading process was particularly pertinent in DTARs comparing multiple index tests and in those including different patient spectra. There was no consensus on the criteria and thresholds to use when assessing the GRADE domains ‘inconsistency’, ‘imprecision’ and ‘publication bias’. When applying GRADE to the comparative DTAR that did not directly compare two index tests, it was challenging to summarise the effect estimates and to make judgments about the quality of evidence for both tests in a single GRADE table.
Conclusions: The perspective from which test accuracy evidence is graded can influence the judgment of its quality. Worked examples illustrating the application of the GRADE domains, in particular ‘inconsistency’, ‘imprecision’ and ‘publication bias’, will facilitate the operationalization of GRADE for diagnostics. Explicit guidance is needed on how to rate the evidence in a comparative test accuracy review in which two tests are not directly compared.