Streamlining systematic reviews: Using machine learning to enhance screening efficiency

Authors
Wilson L1, Robinson K2
1Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
2Johns Hopkins School of Medicine, Baltimore, MD, USA
Abstract
"Background: Conducting a systematic review (SR) is time-consuming and resource-intensive. A growing number of tools offer machine learning (ML)-assisted screening of search results, but evidence about these tools is limited.

Objectives: To assess ML-assisted screening, including accuracy and time saved.

Methods: We used the ML-assisted screening provided in PICO Portal. Our data set comprised eight SRs conducted to support two evidence-based guidelines focused on treatment options for people with chronic kidney disease. For each SR, two reviewers began screening titles and abstracts independently. After the initial training set, citations were re-ranked daily based on the ML predictions of full-text eligibility. Reviewers stopped screening when the predicted recall of citations eligible for full-text review reached 95%. We pooled the sensitivity and specificity of the ML predictions versus the final set of included studies using Stata ‘metadta’. We estimated the time saved during screening by multiplying the average time to screen one abstract by the number of abstracts remaining after 95% recall was reached. For each project, we plotted the proportions of citations included at the title/abstract level and at the full-text level that were captured by the ML screening at the title and abstract level, and we calculated an average of this efficiency weighted by project size.
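
The two calculations described in the Methods (time saved and the size-weighted average) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names and all input numbers below are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def hours_saved(n_remaining: int, seconds_per_abstract: float) -> float:
    """Time saved = abstracts left unscreened x average seconds per abstract.

    n_remaining is the number of abstracts not yet screened when the
    95% recall stopping point was reached. The result is in hours.
    """
    if n_remaining < 0 or seconds_per_abstract < 0:
        raise ValueError("inputs must be non-negative")
    return n_remaining * seconds_per_abstract / 3600.0


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Average of per-project values, weighted by project size."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical example: 1,200 abstracts left unscreened at the stopping point,
# evaluated at 30 seconds per abstract.
example_hours = hours_saved(1200, 30)

# Hypothetical example: two projects that reached the stopping point after
# screening 50% and 75% of their records, with 1,000 and 3,000 records.
example_efficiency = weighted_mean([0.5, 0.75], [1000, 3000])
```

Evaluating `hours_saved` at both ends of an assumed per-abstract time range gives a lower and an upper estimate for each SR, which is how a range of hours saved can be reported.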

Results: We uploaded 29,582 records into PICO Portal (range, 539 to 6704). The pooled sensitivity of the screening was 100% (95% confidence interval [CI], 90%-100%) and the pooled specificity was 50% (95% CI, 30%-60%) (Figure 1). Assuming 30-69 seconds are needed to screen each abstract, we estimate the time saved per SR to be 3 to 43 hours (Table 1). Further, we estimate that reviewers could have captured 95% of the studies eligible for full-text review after screening fewer than 60% of the titles/abstracts (Figure 2).

Conclusions: ML-assisted screening with a tool such as PICO Portal is accurate and reduces the time needed for title/abstract screening. Further research is needed to determine whether results vary across different types of review questions.

Relevance to patients: Our results could affect the amount of time needed to complete an SR and have relevance for researchers and funders."