Assessing risk of bias in non-randomised studies and incorporating GRADE: Initial experience with a new Cochrane 'Risk of bias' tool under development

MacLennan S1, Imamura M1, Dahm P2, Neuberger M2, Reeves B3, MacLennan G4, Omar M1, McClinton S5, Griffiths L6, N’Dow J7
1Academic Urology Unit, University of Aberdeen, Aberdeen, UK, 2Department of Urology, College of Medicine, University of Florida, Gainesville, Florida, USA, 3Faculty of Medicine and Dentistry, University of Bristol, UK, 4Health Services Research Unit, University of Aberdeen, UK, 5Urology Department, NHS Grampian, Aberdeen Royal Infirmary, Aberdeen, UK, 6Department of Cancer Studies and Molecular Medicine, University of Leicester, Clinical Sciences Unit, Leicester General Hospital, Leicester, UK, 7Urology Department, NHS Grampian, Aberdeen Royal Infirmary, and Academic Urology Unit, University of Aberdeen, Aberdeen, UK

Background: In instances where randomised controlled trials (RCT) are impossible or have not been conducted, clinical recommendations and decision-making must rely on other evidence. If systematic reviewers decide to include non-randomised studies (NRS), it is imperative to use a standard method to assess and communicate the risk of bias (RoB) in NRS.

Objectives: To pilot a RoB tool for NRS and make it commensurate with GRADE.

Methods: An extended version of the Cochrane RCT RoB tool was applied to NRS. This included an additional item on the risk of findings of an NRS being explained by confounding. Each pre-specified confounding factor was assessed on the precision of measurement, baseline imbalance, and quality of case-mix adjustment, on 5-point scales. Imbalance was judged by clinical consensus, while other items were assessed by two independent reviewers. Mean 'adjustment' scores per outcome across studies were used to determine the quality of evidence according to GRADE. The tool was applied to 33 NRS retrieved for a systematic review of surgical interventions for localised renal cancer.
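
As an illustration only, the step from per-study adjustment scores to a GRADE rating per outcome might be sketched as below. The study names, outcomes, scores, scale direction (1 = well adjusted, 5 = poorly adjusted) and the cut-off of 3.0 are all hypothetical assumptions made for this sketch; the abstract does not report them. The only rule taken from GRADE itself is that evidence from NRS starts at 'low'.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (study, outcome, adjustment score on a 1-5 scale).
# Assumption: 1 = well adjusted, 5 = poorly adjusted. The scale direction
# is not stated in the abstract.
ratings = [
    ("Study A", "overall survival", 2),
    ("Study B", "overall survival", 3),
    ("Study C", "overall survival", 2),
    ("Study A", "complications", 4),
    ("Study B", "complications", 5),
]


def mean_adjustment_by_outcome(rows):
    """Average the adjustment scores across studies for each outcome."""
    scores = defaultdict(list)
    for _study, outcome, score in rows:
        scores[outcome].append(score)
    return {outcome: mean(values) for outcome, values in scores.items()}


def grade_rating(mean_score, cutoff=3.0):
    """Map a mean adjustment score to a GRADE level.

    NRS evidence starts at 'low' in GRADE. The cut-off is a hypothetical
    placeholder: a mean score above it downgrades the outcome to 'very low'.
    """
    return "very low" if mean_score > cutoff else "low"


means = mean_adjustment_by_outcome(ratings)
grades = {outcome: grade_rating(score) for outcome, score in means.items()}

for outcome, rating in grades.items():
    print(f"{outcome}: mean adjustment {means[outcome]:.2f}, GRADE {rating}")
```

Keeping the cut-off as an explicit parameter makes it easy to see how sensitive the ratings are to that choice.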

Results: The initial 5-point scale was unwieldy and led to disagreement among reviewers. We created scoring guidelines and re-piloted. RoB scores were tabulated rather than aggregated to indicate where likely biases were located. All NRS were rated as either 'low' or 'very low' on GRADE; however, determining an appropriate cut-off required considerable judgement.

Conclusions: Compared with RoB assessment in RCT, assessment of NRS was more difficult and required more time and expertise. In areas where the quality of studies is known to be very low, the added time and complexity may make the assessment not worthwhile. Presentation of the large amount of information generated by this tool is challenging. Further research needs to strike a balance between producing a 'brief' and 'easy' version of the tool and addressing the complex methodological issues inherent in NRS.