Application of the GRADE methodology to Cochrane diagnostic test accuracy reviews

Authors
Gopalakrishna G1, Mustafa R2, Davenport C3, Scholten R4, Hyde C5, Brozek J2, Schünemann H2, Langendam M4, Leeflang X1, Bossuyt P1
1Department of Clinical Epidemiology, Academic Medical Centre, The Netherlands
2Department of Clinical Epidemiology and Biostatistics, McMaster University, Canada
3Public Health, Epidemiology and Biostatistics, University of Birmingham, UK
4Dutch Cochrane Centre, The Netherlands
5Peninsula College of Medicine and Dentistry, University of Exeter, UK
Abstract
Background: The GRADE criteria for rating evidence can also be used for evaluating the results of diagnostic accuracy studies, but more experience is needed.

Objectives: We applied the GRADE methodology to published Cochrane diagnostic test accuracy reviews (DTARs) to better understand how it can be applied to such reviews.

Methods: We selected three DTARs based on diversity of clinical areas and methodological issues. At least three reviewers with expertise in the GRADE approach and/or in diagnostic test accuracy methodology independently rated the evidence in each review according to the five ‘GRADE domains’. Reviewers aimed to explain each judgment by documenting all considerations. Two teleconferences were held to exchange experiences.

Results: Table 1 summarizes the main issues discussed. We observed that some reviewers assessed the evidence from the perspective of patient-important outcomes, while others assessed it from an accuracy standpoint. Having a clear definition of the clinical question before starting the grading process was particularly pertinent in DTARs comparing multiple index tests and those including different patient spectrums. There was no consensus on the criteria and thresholds to use when assessing the GRADE domains ‘inconsistency’, ‘imprecision’ and ‘publication bias’. When applying GRADE to the comparative DTAR that did not directly compare two index tests, it was challenging to summarize the effect estimates and make judgments about the quality of evidence for both tests in a single GRADE table.

Conclusions: The perspective from which test accuracy evidence is graded can influence the judgment of its quality. Worked examples illustrating the application of the GRADE domains, in particular ‘inconsistency’, ‘imprecision’ and ‘publication bias’, will facilitate the operationalization of GRADE for diagnostics. Explicit guidance is needed on how to rate the evidence in a comparative test review where two tests are not directly compared.