The WISEST AI Project: an artificial intelligence decision support tool to assess the quality/bias in systematic reviews

Article type
Authors
Bagheri E1, Kanjii S2, Lunny C3, Nazari T4, Pieper D5, Rad R1, Ridley B6, Shea B2, Sun K7, Tricco A7, WISEST AI Project Team8
1Laboratory for Systems, Software and Semantics (LS3), Toronto Metropolitan University, Toronto, ON, Canada
2The Ottawa Hospital and Ottawa Health Research Institute, Ottawa, ON, Canada
3University of British Columbia, Vancouver, BC, Canada
4Department of Medical Geriatrics, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
5Institute for Health Services and Health System Research, Faculty of Health Sciences Brandenburg, Brandenburg Medical School, Brandenburg, Brandenburg, Germany
6Li Ka Shing Knowledge Institute, St. Michael’s Hospital, Unity Health Toronto, Toronto, ON, Canada
7Li Ka Shing Knowledge Institute, St. Michael’s Hospital, Unity Health Toronto, Toronto, ON, Canada; Epidemiology Division and Institute of Health Policy, Management, and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
8University of British Columbia, Vancouver, BC, Canada; Li Ka Shing Knowledge Institute, St. Michael’s Hospital, Unity Health Toronto, Toronto, ON, Canada; Epidemiology Division and Institute of Health Policy, Management, and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
Abstract
Background: Evidence-based medicine (EBM) stipulates that all relevant and rigorous evidence should be used to make clinical, public health, and policy decisions. Systematic reviews (SRs) were developed as summaries of all available evidence to enable EBM. SRs are commonly appraised using tools such as the AMSTAR 2 checklist (methodological quality) and the ROBIS tool (risk of bias). Currently, no automated tool exists for this purpose. This project aims to build an AI tool to assess the quality of, and biases in, SRs by: (a) building a labelled dataset of 1,000 quality- and bias-assessed SRs to train the AI model, recruiting volunteer collaborators through crowdsourcing; (b) developing the model code and testing its performance; and (c) building a user interface (website) to house the tool.
Methods: On May 24, 2023, we posted a request on Cochrane Engage, a crowdsourcing website, for volunteer collaborators with experience in SRs, critical thinking, and problem-solving skills. Respondents were first sent training materials and study instructions for remote self-training. As collaborators began extracting and assessing SRs, feedback was given on each incorrectly assessed item until their assessments reached 100% accuracy. For the AI modelling, we first use Dense Passage Retrieval (DPR) to identify relevant passages from the PDFs based on the cosine similarity between each passage and the question (a ROBIS or AMSTAR 2 item). A transformer model (S-BERT) will then be trained to rank passages and fine-tuned on our dataset.
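To illustrate the retrieval step, the sketch below ranks passages from an SR against an appraisal item by embedding both and scoring cosine similarity. It is a minimal sketch assuming the sentence-transformers library; the model name, the example AMSTAR 2 item wording, and the passage text are illustrative placeholders, not the project's actual configuration.

```python
# Minimal sketch of question-passage retrieval via cosine similarity,
# assuming the sentence-transformers library. The model name and all
# text below are illustrative placeholders.
from sentence_transformers import SentenceTransformer, util

# Hypothetical bi-encoder standing in for the DPR passage encoder.
encoder = SentenceTransformer("multi-qa-mpnet-base-dot-v1")

# An appraisal item phrased as a retrieval question (illustrative).
question = "Did the review authors use a comprehensive literature search strategy?"

# Passages extracted from a systematic review PDF (placeholder text).
passages = [
    "We searched MEDLINE, Embase, and CENTRAL from inception to June 2022.",
    "Two reviewers independently extracted data using a piloted form.",
    "Reference lists of included studies were screened for additional trials.",
]

# Embed the question and passages, then rank passages by cosine similarity.
q_emb = encoder.encode(question, convert_to_tensor=True)
p_embs = encoder.encode(passages, convert_to_tensor=True)
scores = util.cos_sim(q_emb, p_embs)[0]

# Print passages from most to least relevant to the item.
for score, passage in sorted(zip(scores.tolist(), passages), reverse=True):
    print(f"{score:.3f}  {passage}")
```

In the described pipeline, the top-ranked passages retrieved this way would then feed the S-BERT ranking model, which is subsequently fine-tuned on the labelled dataset.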
Results: To date, we have recruited 56 crowdsourced assessors, 25 of whom are actively working on assessments. In total, 486/1,000 SRs have been assessed: 366 assessments are completed and checked, 98 are pending checking, and 22 are in progress. Two data scientists wrote the data-cleaning code, implemented the DPR step, and coded the model. A website to house the tool has been built.
Conclusion: Crowdsourcing is an effective strategy to build a large, complex, and balanced dataset. We present the first open-access, free AI tool to critically assess SRs.