Machine Learning Reduced Workload for COVID-19 Literature

Authors
Kothari K1, Storr A2, Thomas J3
1Consultant to Library & Digital Information Networks, World Health Organization, Geneva, Switzerland
2Cochrane Central Executive Team, London, United Kingdom
3EPPI Centre, UCL Social Research Institute, University College London, London, United Kingdom
Abstract
Background: This study developed, calibrated, and evaluated a machine learning (ML) classifier designed to reduce citation screening workload for COVID-19 research. Although the classifier was initially developed for the WHO COVID-19 Research Database, a comprehensive database of COVID-19 literature, it also has long-term applications for systematic reviews on COVID-19.
Objectives: To build a machine learning classifier to help identify records for inclusion/exclusion in the WHO COVID-19 Research Database.
Methods: The classifier was trained on title and abstract records that had been included in or excluded from the WHO COVID-19 Research Database between 2020 and 2022. An initial dataset of 6,931 citations and a second dataset of 4,329 citations were used for training. These data had been labeled by information specialists and by a semi-automated process based on title keywords. The data were imported into EPPI-Reviewer, assigned to code sets, and used to train a logistic regression classifier with tri-gram ‘bag of words’ features. The classifier was then calibrated and internally validated using a third dataset of 20,160 citations. A lower cut point was established, below which records would not require manual screening. For the database use case, an upper cut point was also established, above which records could be included in the database without manual screening. Finally, the classifier was validated retrospectively on five COVID-19 reviews covering a broad range of topics, to determine whether it could reduce the screening workload for a systematic review without excluding any of the records finally included.
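As an illustration of the calibration step, the following minimal sketch shows one way a lower cut point could be chosen from a labeled calibration set so that a target recall is met. It is not the EPPI-Reviewer implementation; the function names lower_cut_point and recall_at, the score scale, and the toy data are all assumptions made for this example.

```python
import math


def lower_cut_point(scores, labels, target_recall=0.99):
    """Return the highest cut point at which recall is still >= target_recall.

    scores: classifier scores, one per record (higher = more likely relevant).
    labels: 1 for records that were included, 0 for excluded.
    Hypothetical helper for illustration; not the authors' code.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("calibration set contains no included records")
    # Number of included records that must score at or above the cut point.
    # The small epsilon guards against floating-point error in the product.
    needed = max(1, math.ceil(target_recall * len(positives) - 1e-9))
    return positives[needed - 1]


def recall_at(scores, labels, cut):
    """Fraction of included records scoring at or above the cut point."""
    included = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= cut for s in included) / len(included)


# Toy calibration data (invented for the example).
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 1, 0, 1, 0]
cut = lower_cut_point(toy_scores, toy_labels, target_recall=0.75)
```

Records scoring below the returned cut point would be the ones exempted from manual screening; in practice the cut point would be set on a held-out calibration set, as described above, rather than on the training data.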
Results: The “WHO-Cochrane-UCL COVID-19 Classifier” was trained using 11,260 records, of which 6,939 citations were labeled for inclusion in the WHO COVID-19 Research Database. The classification threshold was calibrated on 20,160 records to achieve a target recall of 99%. Using both a lower and an upper cut point gave an estimated net screening workload reduction of 75% for the database use case. When tested against the final included citations identified for five COVID-19 reviews, the classifier achieved 100% recall at the pre-specified cut point.
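To make the two-cut-point arithmetic concrete, the sketch below routes records into three groups and reports the share that no longer needs manual screening. It is a simplified illustration only: the function names triage and workload_reduction, the cut point values, and the scores are invented, and the plain fraction computed here is an assumption that may differ from how the net reduction reported above was estimated.

```python
def triage(scores, lower, upper):
    """Count records per route given a lower and an upper cut point.

    Below `lower`: excluded without manual screening.
    Above `upper`: included without manual screening.
    Otherwise: sent to manual screening.
    Hypothetical helper for illustration; not the authors' code.
    """
    counts = {"auto_exclude": 0, "manual": 0, "auto_include": 0}
    for s in scores:
        if s < lower:
            counts["auto_exclude"] += 1
        elif s > upper:
            counts["auto_include"] += 1
        else:
            counts["manual"] += 1
    return counts


def workload_reduction(scores, lower, upper):
    """Fraction of records that skip manual screening."""
    counts = triage(scores, lower, upper)
    return 1 - counts["manual"] / len(scores)


# Toy scores and cut points (invented for the example).
toy_scores = [0.05, 0.1, 0.5, 0.6, 0.95, 0.99, 0.2, 0.97]
toy_counts = triage(toy_scores, lower=0.15, upper=0.9)
toy_reduction = workload_reduction(toy_scores, lower=0.15, upper=0.9)
```

With these toy values, three of the eight records fall between the cut points, so five of eight would bypass manual screening.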
Conclusion: The WHO-Cochrane-UCL classifier reduces manual screening workload with a very low risk of missing eligible studies.