How do systematic review users and producers interpret the stability of review findings based on GRADE quality of evidence ratings?

Authors
Thaler K1, Sommer I2, Dobrescu AI3, Swinson Evans T4, Lohr K4, Gartlehner G4
1Austrian Cochrane Branch, Austria
2Danube University Krems, Department for Evidence-based Medicine and Clinical Epidemiology, Austria
3Victor Babes University of Medicine and Pharmacy, Timisoara, Romania
4RTI International, Research Triangle Park, NC, USA
Abstract
Background:
The GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach uses information about study limitations, imprecision, inconsistency, indirectness, and publication bias to determine quality of evidence (QoE) and communicate the confidence that systematic reviewers have in the estimate of effect size. Semantically, key elements of QoE definitions include the concepts of truth, confidence (in effect estimates), modifiers of levels of confidence (e.g. very, moderately, or limited), and deficiencies.

Objectives:
Review authors (producers) and readers (users) may interpret terms intended to convey certainty or stability of results differently. We sought to determine the degree of stability of effects over time that review users and/or producers associate with QoE grades.

Methods:
In an anonymous web-based survey, participants used an interactive graphical sliding scale (0% to 100%) to indicate how certain they were that future results would NOT substantially change the estimated effect, given 'high', 'moderate', or 'low' QoE.

Results:
A total of 208 people provided data: 82 (39%) identified as producers, 49 (24%) as users, and 77 (37%) as both users and producers of systematic reviews (SRs). Overall, SR users and producers assigned similar likelihoods that treatment effects will remain stable (P = 0.29), although answers varied widely within groups. Fig 1 illustrates the ranges of responses and the skewed results for high and low QoE in all three groups. For all groups combined, the mean (SD) for the “estimate that high QoE will remain stable as new studies emerge” was 86.0% (8.2). For moderate QoE, the pooled estimate of stability was 61.0% (11.8); for low QoE, it was 34.8% (14.5).
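
The group percentages above follow directly from the reported counts. As a minimal sketch of that arithmetic (the variable and function names are illustrative assumptions, not part of the study's analysis), the shares can be recomputed as follows:

```python
# Respondent counts as reported in the Results paragraph.
counts = {"producers": 82, "users": 49, "both": 77}

# Total number of respondents who provided data.
total = sum(counts.values())


def share_percent(n, total):
    """Return n as a whole-number percentage of total (hypothetical helper)."""
    return round(100 * n / total)


# Percentage of respondents in each self-identified group.
shares = {group: share_percent(n, total) for group, n in counts.items()}

print(total, shares)
```

Rounding to whole percentages reproduces the figures quoted in the text, and the three shares sum to 100%.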

Conclusions:
This study shows that the interpretation of GRADE QoE ratings varies; however, the differences in interpretation are not between users and producers of SRs. The wide range of associated likelihoods indicates a need for discussion about the meaning behind the definitions of QoE. Furthermore, future studies could test the predictive validity of the GRADE approach in real-world bodies of evidence.