Using Microsoft Academic Graph to maintain Cochrane Incontinence Specialised Registers of controlled trials and economic evaluations

Authors
Shemilt I¹, Wallace S², Thomas J¹, Elstub L², Johnson E², Vale L²
¹ EPPI-Centre, UCL Social Research Institute
² Cochrane Incontinence, Population Health Sciences Institute, Newcastle University
Abstract
Background: Incorporating economic evidence into Cochrane Intervention Reviews (CIRs) entails the use of specialised search methods to identify relevant economic evaluations (EEs), a requirement that is a barrier to scaling up production of formal economics components of such reviews. Maintaining Cochrane Review Group (CRG) Specialised Registers (SRs) is also labour intensive, involving regular searches of multiple databases. Microsoft Academic Graph (MAG) is an online repository and dataset comprising >230 million open access bibliographic records of research articles, connected in a large network graph of conceptual and citation relationships. Its size, coverage, features and network graph structure make MAG potentially useful as a single source for study identification when updating SRs and collections of CIRs.

Objectives: To develop and evaluate semi-automated MAG workflows for establishing and maintaining a new CRG SR of EEs alongside a pre-existing CRG SR of controlled trials.

Methods: We attempted to match 9,133 study reports from the Cochrane Incontinence SR of controlled trials in October 2019 (encompassing 1,614 included studies across 83 CIRs) to corresponding MAG records. We also identified 196 reports of full EEs among these records with the help of machine learning. We then conducted a retrospective simulation study to assess the performance of a MAG workflow in an ‘automated update’ scenario for: (i) allocating studies from the Cochrane Incontinence SR of controlled trials to the ‘correct’ CIR(s); and (ii) identifying full EEs.
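
The abstract does not describe how register reports were matched to MAG records. Purely as an illustration, one simple approach is exact matching on a normalised title plus publication year, as sketched below. The field names (`id`, `mag_id`, `title`, `year`) and the functions `normalise_title` and `match_reports` are hypothetical and are not taken from the study or from any MAG API.

```python
import re
import unicodedata


def normalise_title(title: str) -> str:
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def match_reports(register, mag_records):
    """Map register report ids to MAG ids (hypothetical record layout).

    A report is matched only when exactly one MAG record shares its
    normalised title and year; ambiguous or missing candidates are
    left unmatched for manual checking.
    """
    index = {}
    for rec in mag_records:
        key = (normalise_title(rec["title"]), rec["year"])
        index.setdefault(key, []).append(rec["mag_id"])

    matches = {}
    for rep in register:
        key = (normalise_title(rep["title"]), rep["year"])
        candidates = index.get(key, [])
        if len(candidates) == 1:
            matches[rep["id"]] = candidates[0]
    return matches
```

A matcher of this kind would be expected to miss records that are absent from the target dataset altogether, whatever the matching rule.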

Results: A total of 5,351 study reports (58% of the Cochrane Incontinence SR), reporting 1,356 included studies (84%) and including 120 reports of full EEs (61%), were successfully matched to MAG records. Of the 3,782 study reports that could not be matched, most were trial registry records (49%) or conference abstracts (44%). When used for allocating MAG-matched studies to CIRs, the MAG workflow achieved precision of 0.19 at 99% recall and precision of 0.83 at 95% recall (Figure 1). However, the same workflow performed poorly when used for identifying EEs (Figure 2).
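
For readers unfamiliar with the "precision at a given recall" metric reported above, the following minimal sketch shows one common way to compute it from a ranked list of candidate records. It assumes each candidate has a relevance score and a binary label (1 = belongs to the review, 0 = does not). The function name `precision_at_recall` and the toy data are illustrative assumptions, not the study's actual evaluation code.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the shortest ranked list that reaches target_recall.

    Candidates are ranked by descending score; we walk down the ranking
    until the required share of relevant records has been retrieved and
    report the proportion of retrieved records that are relevant.
    Tied scores are not treated specially in this sketch.
    """
    total_relevant = sum(labels)
    if total_relevant == 0:
        raise ValueError("at least one relevant record is required")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    for n_retrieved, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives / total_relevant >= target_recall:
            return true_positives / n_retrieved
    # Only reached if target_recall exceeds 1.0
    return true_positives / len(ranked)


# Toy example: 3 relevant records among 5 candidates
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 1]
print(precision_at_recall(scores, labels, 0.6))
print(precision_at_recall(scores, labels, 1.0))
```

Demanding higher recall means screening further down the ranking, which typically lowers precision. This is the pattern seen in the reported drop from 0.83 at 95% recall to 0.19 at 99% recall.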

Conclusions: Novel semi-automated MAG workflows are being developed to help maintain Cochrane Incontinence SRs and reviews, aiming to make the process of incorporating economic evidence into CIRs more efficient. The ability of a MAG workflow to accurately assign records to the ‘correct’ CIRs is already good. However, automatically identifying economic evaluations is more challenging and further work is underway to improve performance in this regard. Findings from further analyses of the performance of MAG workflows when used to prospectively identify new reports of controlled trials and EEs will also be reported.

Patient or healthcare consumer involvement: This is a methods research study with no direct patient or healthcare consumer involvement. However, its findings are expected to help ensure that systematic reviews can include more relevant evidence and can more easily be kept up to date, with indirect benefits for all patients.