Making research easier to find: improving the discoverability of clinical trial protocols in Wellcome Open Research using Cochrane Linked Data

Authors
Elliott J¹, Ali A², Alam S³, Thomas J⁴, Mavergames C², Lawrence R³
¹Cochrane Australia
²Cochrane
³F1000 Group
⁴University College London
Abstract
Background:
Much of the effort involved in producing systematic reviews is spent screening abstracts and articles that turn out to be irrelevant to the review. This effort could be reduced with better descriptors (‘metadata’) of research outputs, which would improve the ‘signal to noise’ ratio of search results for systematic reviews.

Objectives:
To assess the feasibility of using Cochrane Linked Data to attach structured PICO (Population, Intervention, Comparator, Outcome) metadata to clinical trial protocols published in Wellcome Open Research.

Methods:
Wellcome Open Research is an open access publication platform for Wellcome-funded research, operated by F1000. Wellcome encourages the publication of trial protocols to improve the transparency of research. Cochrane Linked Data is a set of technologies for annotating systematic review content (reviews, studies, analyses) with terms from widely adopted controlled vocabularies (metadata), organised in a structured PICO micrograph.

Cochrane Linked Data vocabularies were used to annotate trial protocols submitted for publication in Wellcome Open Research. Each submitted protocol was processed by Cochrane’s PICO Classifier, a text mining/machine learning system, which used the protocol text to suggest controlled vocabulary terms relevant to the topic of the protocol. The suggested terms were sent to the protocol authors, who reviewed them and either verified or modified them. The final vocabulary terms were published within the article as text with an underlying hyperlink to the Cochrane Linked Data vocabulary browser.
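
The workflow described above (machine-suggested terms, author review, publication as hyperlinked text) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Term` structure, the `apply_author_review` and `render_term` helpers, the concept identifiers and the `vocab.example.org` address are hypothetical and do not represent the PICO Classifier, the Cochrane Linked Data vocabulary browser or the Wellcome Open Research submission system.

```python
from dataclasses import dataclass
from html import escape

# Hypothetical base URL; the real vocabulary browser address is not given here.
VOCAB_BROWSER = "https://vocab.example.org/concept/"


@dataclass(frozen=True)
class Term:
    role: str        # "Population", "Intervention", "Comparator" or "Outcome"
    label: str       # human-readable vocabulary term
    concept_id: str  # hypothetical identifier in the controlled vocabulary


def apply_author_review(suggested, rejected_ids=(), added=()):
    """Drop the terms the authors rejected, then append the terms they added."""
    rejected = set(rejected_ids)
    kept = [term for term in suggested if term.concept_id not in rejected]
    return kept + list(added)


def render_term(term):
    """Render a verified term as text with an underlying hyperlink."""
    href = VOCAB_BROWSER + escape(term.concept_id, quote=True)
    return f'<a href="{href}">{escape(term.label)}</a>'


# Made-up suggestions standing in for classifier output.
suggested = [
    Term("Population", "Adults", "C001"),
    Term("Intervention", "Aspirin", "C002"),
    Term("Outcome", "Headache", "C003"),
]

# The authors reject one suggested outcome and supply another.
final_terms = apply_author_review(
    suggested,
    rejected_ids=["C003"],
    added=[Term("Outcome", "Mortality", "C004")],
)

links = [render_term(term) for term in final_terms]
```

Keeping the review step separate from rendering means the author-verified list, rather than the raw machine suggestions, is the only input to what gets published.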

Results:
Prototype systems have been developed and piloted for the workflow described above. The results of this pilot work will be presented, including the accuracy of the machine learning classifications, the responses of the trial protocol authors to machine-generated terms, and the feasibility of integrating this workflow into the Wellcome Open Research submission system.

Conclusions:
This is the first project to improve the discoverability of trial protocols with linked data (‘semantic web’) approaches. These technologies have the potential to substantially improve the discoverability and use of research outputs, including trial protocols, trial registrations and publications of trial findings.

Patient or healthcare consumer involvement:
Patients and consumers were not involved in this work.