Abstract
Background: Screening abstracts for eligibility is a vitally important but tedious step in the systematic review process. Typically, electronic searches for a review yield several thousand abstracts, which are then read by the reviewers and either excluded from or included in the review according to predefined criteria. This is a laborious, expensive process.

Objectives: To reduce the burden on researchers conducting systematic reviews by applying machine learning (ML) techniques to expedite abstract screening. In particular, to build a classification model from a manually classified subset of the entire corpus of citations retrieved via the search strategy, and then use this model to automatically include or exclude the remaining citations.

Methods: For our classification model we use the Support Vector Machine (SVM), a state-of-the-art algorithm well suited to textual data. To expedite the training of the model – and thereby reduce the reviewers’ workload – we use a technique known as active learning, in which the expert interactively trains the SVM by providing labels for abstracts sequentially (i.e., designating them as ‘eligible’ or ‘ineligible’). The relatively low prevalence of eligible abstracts and the requirement that a semi-automated screening process for systematic reviews must not wrongly exclude any relevant abstracts present novel challenges for ML algorithms. We have developed new methods that address these challenges.

Results: We ran experiments on several datasets from previously conducted reviews. Simulating active learning, we show that our method can reduce workload by 40 to 50% without wrongly excluding any relevant studies.

Conclusions: Our preliminary work indicates that the burden on researchers conducting systematic reviews can be reduced substantially without sacrificing thoroughness.
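To make the active-learning loop described in the Methods concrete, the sketch below pairs a linear SVM with simple uncertainty sampling: at each step the expert is asked to label the abstract closest to the current decision boundary. This is a minimal illustration assuming scikit-learn, a TF-IDF representation, and a hypothetical get_label callback standing in for the human reviewer; the paper's actual query strategy, stopping rule, and safeguards against wrongly excluding relevant abstracts are not reproduced here.

```python
# Minimal sketch of SVM-based active learning for abstract screening.
# Assumes scikit-learn; uncertainty sampling stands in for the paper's
# actual query strategy, which is not detailed in the abstract.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def active_learning_screen(abstracts, get_label, n_seed=10, n_queries=100):
    """Iteratively query an expert (get_label: text -> 0/1) and classify the rest."""
    X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)

    # Seed set labeled up front; assumed to contain both eligible (1)
    # and ineligible (0) abstracts so the SVM can be fitted.
    labeled = list(range(n_seed))
    y = [get_label(abstracts[i]) for i in labeled]

    for _ in range(n_queries):
        clf = LinearSVC().fit(X[labeled], y)
        margins = np.abs(clf.decision_function(X))  # distance from the hyperplane
        margins[labeled] = np.inf                   # never re-query labeled items
        query = int(np.argmin(margins))             # most uncertain abstract
        labeled.append(query)
        y.append(get_label(abstracts[query]))       # expert provides the label

    # Final model predicts eligibility for every abstract in the corpus;
    # the manually labeled fraction approximates the reviewers' workload.
    clf = LinearSVC().fit(X[labeled], y)
    return clf.predict(X)
```

In this setup the workload reduction reported in the Results corresponds to stopping the querying loop well before all abstracts have been labeled, with the remaining citations included or excluded automatically by the trained model.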