Can Cochrane's machine learning classifier increase the efficiency of guideline production?

Authors
McDonald S1, Turner T1, Hill K2, Elliott J1, Thomas J3
1Cochrane Australia
2Stroke Foundation
3University College London
Abstract
Background:
Keeping systematic reviews up to date requires substantial human resources. Manually screening citations to identify relevant records for inclusion is one of the most laborious tasks. Cochrane has recently developed a machine learning classifier (RCT Classifier) that accurately distinguishes between randomised and non-randomised studies, reducing screening load and greatly increasing the efficiency of review production. The RCT Classifier is also likely to be useful in guideline development, where the same screening task is undertaken across many clinical questions.

Objective:
To evaluate the accuracy and utility of the RCT Classifier for use in guideline development by conducting a retrospective analysis of citations of randomised trials included in the Australian Stroke Foundation Clinical Guidelines for Stroke Management in 2017.

Methods:
The Stroke Guidelines cover eight topic areas, comprising 89 clinical questions and about 300 PICO questions. Searches of the major bibliographic databases across all clinical questions yielded over 109,000 citations, which were dual-screened by members of the various guideline working groups. The citations to reviews and randomised trials included in the guidelines are being collated separately. Once all the RCTs have been identified, we will run them through the RCT Classifier, which has been calibrated to achieve high recall, and record the probability score for each citation. We will note how many studies would have been 'lost' had the citations that the classifier deemed not to be RCTs been excluded. We will also estimate the likely time saving for guideline production had the classifier been available.
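
The analysis described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the RCT Classifier itself or its interface: the names `ScoredCitation`, `count_lost` and `screening_hours_saved`, the cut-off of 0.1, the probability scores, the 30 seconds per citation and the number of excluded citations are all hypothetical values chosen to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class ScoredCitation:
    citation_id: str
    score: float  # hypothetical classifier probability (0 to 1) that the record is an RCT


def count_lost(included_rcts, threshold):
    """Return the included RCTs scoring below the cut-off, and the resulting recall."""
    if not included_rcts:
        raise ValueError("at least one included RCT is required")
    lost = [c for c in included_rcts if c.score < threshold]
    recall = 1 - len(lost) / len(included_rcts)
    return lost, recall


def screening_hours_saved(n_excluded, seconds_per_citation, screeners=2):
    """Estimate hours of manual screening avoided; dual screening means two screeners."""
    return n_excluded * seconds_per_citation * screeners / 3600


# Invented scores for four RCTs already known to be included in a guideline.
included = [
    ScoredCitation("rct-a", 0.95),
    ScoredCitation("rct-b", 0.40),
    ScoredCitation("rct-c", 0.05),
    ScoredCitation("rct-d", 0.80),
]

lost, recall = count_lost(included, threshold=0.1)  # 0.1 is an assumed cut-off
print([c.citation_id for c in lost], recall)

# Assumed figures: 60,000 citations excluded by the classifier, 30 seconds each.
print(screening_hours_saved(60_000, seconds_per_citation=30))
```

In this sketch, a 'lost' study is an included RCT whose score falls below the cut-off, so recall is the proportion of included RCTs that the classifier would have retained.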

Results:
The 89 clinical questions include over 500 citations to reviews, randomised trials and observational studies. The evaluation of the RCT Classifier's performance in identifying the included trials is underway, and the results will be presented.

Conclusions:
Incorporating new research into the existing evidence base is time-consuming and costly. Machine learning approaches to study identification and prospective evidence surveillance have the potential to substantially reduce the time and cost of evidence production, thus enabling a living evidence model that ultimately accelerates benefits to patients.

Patient or healthcare consumer involvement:
Not applicable.