Assessing risk of bias in non-randomised studies and incorporating GRADE: Initial experience with a new Cochrane 'Risk of bias' tool under development

2011 Madrid

MacLennan S¹, Imamura M¹, Dahm P², Neuberger M², Reeves B³, MacLennan G⁴, Omar M¹, McClinton S⁵, Griffiths L⁶, N’Dow J⁷

¹Academic Urology Unit, University of Aberdeen, Aberdeen, UK

²Department of Urology, College of Medicine, University of Florida, Gainesville, Florida, USA

³Faculty of Medicine and Dentistry, University of Bristol, UK

⁴Health Services Research Unit, University of Aberdeen, UK

⁵Urology Department, NHS Grampian, Aberdeen Royal Infirmary, Aberdeen, UK

⁶Department of Cancer Studies and Molecular Medicine, University of Leicester, Clinical Sciences Unit, Leicester General Hospital, Leicester, UK

⁷Urology Department, NHS Grampian, Aberdeen Royal Infirmary, and Academic Urology Unit, University of Aberdeen, Aberdeen, UK

Background: Where randomised controlled trials (RCTs) are impossible or have not been conducted, clinical recommendations and decision-making must rely on other evidence. If systematic reviewers decide to include non-randomised studies (NRS), it is imperative to use a standard method to assess and communicate the risk of bias (RoB) in NRS.

Objectives: To pilot a RoB tool for NRS and make it commensurate with GRADE.

Methods: An extended version of the Cochrane RCT RoB tool was applied to NRS. The extension added an item on the risk that the findings of an NRS are explained by confounding. Each pre-specified confounding factor was assessed on 5-point scales for precision of measurement, baseline imbalance, and quality of case-mix adjustment. Imbalance was judged by clinical consensus, while the other items were assessed by two independent reviewers. Mean 'adjustment' scores per outcome across studies were used to determine the quality of evidence according to GRADE. The tool was applied to 33 NRS retrieved for a systematic review of surgical interventions for localised renal cancer.
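
As an illustration only, the step of averaging 'adjustment' scores per outcome across studies and mapping the mean to a GRADE rating might be sketched as follows. The study names, outcomes, confounders, scores, the direction of the scale (1 = best, 5 = worst) and the cut-off of 3.5 are all hypothetical assumptions made for this sketch; the abstract does not report them, and this is not the scoring rule used in the pilot.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (study, outcome, confounder, adjustment score).
# Assumption: scores run from 1 (best adjustment) to 5 (worst adjustment).
scores = [
    ("Study A", "overall survival", "tumour size", 2),
    ("Study A", "overall survival", "age", 3),
    ("Study B", "overall survival", "tumour size", 4),
    ("Study B", "overall survival", "age", 5),
    ("Study A", "complications", "tumour size", 4),
    ("Study B", "complications", "tumour size", 5),
]

# Hypothetical cut-off separating 'low' from 'very low' quality of evidence.
CUT_OFF = 3.5


def mean_adjustment_by_outcome(records):
    """Average the adjustment scores per outcome, pooling all studies."""
    by_outcome = defaultdict(list)
    for _study, outcome, _confounder, score in records:
        by_outcome[outcome].append(score)
    return {outcome: mean(values) for outcome, values in by_outcome.items()}


def grade_rating(mean_score, cut_off=CUT_OFF):
    """Map a mean adjustment score to a GRADE rating for NRS evidence."""
    return "low" if mean_score <= cut_off else "very low"


summary = {
    outcome: (m, grade_rating(m))
    for outcome, m in mean_adjustment_by_outcome(scores).items()
}

for outcome, (m, rating) in summary.items():
    print(f"{outcome}: mean adjustment score {m:.2f}, GRADE {rating}")
```

Any such cut-off is a judgement call, so the threshold is kept as an explicit parameter rather than built into the function.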

Results: The initial 5-point scale was unwieldy and led to disagreement among reviewers. We created scoring guidelines and re-piloted. RoB scores were tabulated rather than aggregated to indicate where likely biases were located. All NRS were rated as either 'low' or 'very low' on GRADE; however, determining an appropriate cut-off required considerable judgement.

Conclusions: Compared with RoB assessment in RCTs, assessment of NRS was more difficult and required more time and expertise. In areas where the quality of studies is known to be very low, the added time and complexity may make the assessment not worthwhile. Presenting the large amount of information generated by this tool is challenging. Further research needs to strike a balance between producing a 'brief' and 'easy' version and addressing the complex methodological issues inherent in NRS.