Machine-learning assisted screening increases efficiency of systematic review

Authors
Qureshi R¹, Robinson K², Butler M³, Agai E⁴
¹University of Colorado Anschutz Medical Campus
²Johns Hopkins School of Medicine
³University of Minnesota School of Public Health
⁴PICO Portal
Abstract
Background: Conventional systematic review (SR) methods are time-consuming and resource-intensive. Artificial intelligence (AI) methods such as machine learning and deep learning can help reviewers complete SR tasks in less time and with fewer resources. PICO Portal (PP) is an AI-assisted SR platform that prioritizes articles for screening using several algorithms, including decision tree and deep learning models.

Objectives: To assess the efficiency and accuracy of AI-assisted title/abstract screening in PICO Portal.

Methods: Our data set comprised eight completed SRs, each screened by two independent reviewers, totaling 56,728 records (range: 4,204 to 14,185 per SR) on topics ranging from social to biomedical sciences. For each SR, we simulated screening in batches of 100 articles: each batch was used to train the models and predict eligibility, the articles not yet screened were re-ranked, and the predicted eligibility was compared with the actual decisions from the SR. For each project, we plotted the proportions of records included at the title/abstract and full-text stages that were captured by AI-prioritized screening at the title/abstract level, and we calculated an average of this efficiency weighted by project size. We meta-analyzed the sensitivity and specificity of the predictions against the reviewers’ final decisions using the Stata command ‘metadta’.
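The batch-wise simulation described in the Methods can be illustrated with a minimal Python sketch. It is not PICO Portal's implementation: the token log-odds scorer below is a toy stand-in for the platform's decision tree and deep learning models, and the function names (token_scores, simulate_screening, recall_curve) are hypothetical. It assumes each record is a short text with a known 0/1 inclusion label and that at least one record is included.

```python
import math
from collections import Counter

def token_scores(texts, labels):
    """Toy model (assumption): smoothed log-odds weight per token, learned from screened records."""
    inc, exc = Counter(), Counter()
    for text, label in zip(texts, labels):
        (inc if label else exc).update(set(text.lower().split()))
    n_inc = sum(labels)
    n_exc = len(labels) - n_inc
    return {
        t: math.log((inc[t] + 1) / (n_inc + 2)) - math.log((exc[t] + 1) / (n_exc + 2))
        for t in set(inc) | set(exc)
    }

def simulate_screening(texts, labels, batch_size=100):
    """Return record indices in the order a reviewer would screen them."""
    remaining = list(range(len(texts)))
    screened = []
    while remaining:
        # Screen the next batch in the current ranking.
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        screened.extend(batch)
        # Retrain on everything screened so far, then re-rank the rest.
        weights = token_scores([texts[i] for i in screened],
                               [labels[i] for i in screened])

        def score(i):
            return sum(weights.get(t, 0.0) for t in set(texts[i].lower().split()))

        remaining.sort(key=score, reverse=True)  # stable: ties keep their order
    return screened

def recall_curve(order, labels):
    """Fraction of all included records found after each screened record."""
    total = sum(labels)  # assumes at least one included record
    found, curve = 0, []
    for i in order:
        found += labels[i]
        curve.append(found / total)
    return curve
```

Plotting recall_curve against the fraction of records screened gives the kind of capture curve described above; replacing token_scores with a real classifier would leave the simulation loop unchanged.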

Results: We estimate that if the active learning AI predictions had been used, reviewers would have needed to screen only 20-50% of title/abstract records to capture 95% of eligible records (Figure 1). After screening 10%, 25%, 50%, and 70% of title/abstract records, the average project would have captured approximately 60%, 85%, 95%, and 99%, respectively, of the records included at the title/abstract stage (Figure 2). Pooled sensitivity was higher than pooled specificity (95% vs. 68%) (Figures 3 and 4).
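As a rough illustration of how such figures can be derived, the sketch below computes the fraction of records that must be screened to reach a target recall, and a project-size-weighted average across projects. The function names and all numbers in the usage note are hypothetical and are not the study's data.

```python
def fraction_to_reach(curve, target=0.95):
    """Smallest fraction of records screened at which recall reaches the target.

    `curve[k-1]` is the recall after screening k records in ranked order.
    """
    for k, recall in enumerate(curve, start=1):
        if recall >= target:
            return k / len(curve)
    return 1.0  # target never reached before the last record

def weighted_average(values, sizes):
    """Average of per-project values, weighted by project size (record count)."""
    return sum(v * n for v, n in zip(values, sizes)) / sum(sizes)
```

For example, with made-up inputs, fraction_to_reach([0.5, 0.5, 1.0, 1.0]) gives 0.75, and weighted_average([0.2, 0.5], [100, 300]) gives 0.425, so the larger project pulls the average toward its own value.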

Conclusions: Based on our analysis, we estimate that 40-60% of screening effort can be saved using PICO Portal, an AI-assisted, web-based SR platform. Future research should examine how missing the final 5% of records affects review conclusions and should assess the resource-benefit ratio.

Patient relevance and involvement: Our findings and recommendations for future research reflect the researcher and funder perspectives. Our conclusions bear directly on the amount of time reviewers need to complete an SR. This work did not involve any stakeholders, patients, or consumers.