Abstract
Background: Machine-learning and citizen-science initiatives are already transforming Cochrane’s centralised efforts to identify reports of trials. The RCT machine classifier, which assigns a probability ranking to citations, can substantially reduce the screening workload while still retaining very high recall. In recent years, Cochrane Crowd has collectively identified many thousands of reports of trials. The challenge is to integrate these new approaches into routine workflows for systematic reviews.
Objectives: To evaluate the performance (accuracy and workload reduction) of the RCT Machine Classifier + Cochrane Crowd versus standard screening approaches in a series of case studies of individual Cochrane Reviews; and to identify the practicalities of introducing Crowd, with or without Machine, as a service for reviewers.
Methods: Several evaluations are under way involving reviews from Cochrane Consumers and Communication; Cochrane Developmental, Psychosocial and Learning Problems; and Cochrane living systematic review pilots. As part of the evaluations, citations retrieved by searches are ranked by the RCT Classifier (with pre-specified probability thresholds applied) and filtered against citations already screened by the Crowd (‘known assessments’). Previously unscreened citations are then sent to the Crowd for assessment. In parallel, the performance of Classifier + Crowd is compared to various combinations of manual screening.
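The triage workflow described above can be sketched in a few lines. This is a minimal illustration only: the function and variable names, the threshold value, and the data shapes are all hypothetical, not Cochrane's actual implementation.

```python
# Illustrative sketch of the screening triage described in the Methods.
# All names and the threshold are hypothetical, not Cochrane's implementation.
from collections import namedtuple

Citation = namedtuple("Citation", "id")

RCT_THRESHOLD = 0.1  # stands in for a pre-specified probability cutoff


def triage(citations, classifier_score, known_assessments):
    """Split search results into three sets:
    - resolved: already screened by the Crowd ('known assessments')
    - to_crowd: previously unscreened, above the classifier threshold
    - set_aside: below the classifier threshold
    """
    resolved, to_crowd, set_aside = [], [], []
    for cit in citations:
        if cit.id in known_assessments:
            # Reuse the Crowd's earlier study-design classification
            resolved.append((cit, known_assessments[cit.id]))
        elif classifier_score(cit) >= RCT_THRESHOLD:
            # Plausibly an RCT: send to the Crowd for assessment
            to_crowd.append(cit)
        else:
            set_aside.append(cit)
    return resolved, to_crowd, set_aside
```

In this sketch, only the `to_crowd` set generates new screening work; the other two sets are resolved automatically, which is where the workload reduction comes from.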
Results: The pilots are ongoing. However, in one, 89% of citations retrieved from Embase had already been through Cochrane’s Crowd-Machine systems and assigned relevant study-design classifications. In the other pilots, where the Crowd is performing prospective screening for specific reviews, interim results show that the Crowd can reduce the number of citations authors need to screen by as much as 80%, representing several thousand citations, all returned within days of being sent to the Crowd.
Conclusions: Machine and crowd approaches have proven successful in improving efficiencies for centralised trial-searching activities and offer the prospect of similar efficiencies when implemented at the review level.