Abstract
Background:
A key task when conducting a systematic review is to identify and remove duplicate records retrieved by a literature search across multiple databases, a process referred to as deduplication. Deduplication is time-consuming and error-prone, particularly when processing thousands of references from multiple sources. It may be done using reference management software or bespoke deduplication tools, available either standalone or within systematic review software, and some approaches combine automation with manual checking by humans. Some tools are only accessible through expensive, proprietary software or operate in a “black box” environment. It is not known how these tools compare with each other or which performs best to minimise errors and reduce the time spent deduplicating.
Objectives:
We are evaluating how eight bespoke deduplication tools perform, to inform choices about which to use. The tools are: 1) Covidence; 2) EPPI-Reviewer; 3) the Deduplicator; 4) Rayyan; 5) PICO Portal; 6) Deduclick; 7) ASySD; and 8) HubMeta Deduplicator.
Methods:
Our sample set comprises re-run searches from a random selection of 10 Cochrane reviews published between 2020 and 2022. Two experienced information specialists will independently deduplicate these to create 10 deduplicated gold standard sets. Each set will then be deduplicated using each tool under investigation and compared with its gold standard set. The following outcomes will be measured (a sketch of how the set comparisons underlying outcomes 1–3 could be made appears after the list):
1. Unique references removed
2. Duplicates missed
3. Additional duplicates identified
4. Time required to deduplicate
5. Qualitative analysis of unexpected findings of interest
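The abstract does not specify how the comparison between each tool's output and the gold standard will be implemented. As a minimal sketch, assuming each record carries a unique identifier, outcomes 1–3 can be framed as set differences between the records a tool retains and the records the gold standard retains; function and variable names below are hypothetical and for illustration only.

def compare_with_gold_standard(tool_kept: set[str], gold_kept: set[str]) -> dict[str, set[str]]:
    """Return candidate deduplication errors for one tool.

    tool_kept -- record IDs retained by the tool after deduplication
    gold_kept -- record IDs retained in the manually created gold standard set
    """
    return {
        # Records the gold standard kept but the tool discarded: candidates
        # for outcome 1 (unique references removed), or outcome 3
        # (additional duplicates identified) if manual checking shows the
        # gold standard itself missed a duplicate.
        "removed_by_tool_only": gold_kept - tool_kept,
        # Records the tool kept but the gold standard discarded: candidates
        # for outcome 2 (duplicates missed by the tool).
        "kept_by_tool_only": tool_kept - gold_kept,
    }

# Illustrative usage with made-up record IDs
gold = {"rec1", "rec2", "rec3"}        # rec2_dup and rec3_dup removed manually
tool = {"rec1", "rec2", "rec2_dup"}    # missed rec2_dup, wrongly removed rec3
print(compare_with_gold_standard(tool, gold))
# {'removed_by_tool_only': {'rec3'}, 'kept_by_tool_only': {'rec2_dup'}}

Candidates flagged by such a comparison would still require manual inspection to classify them against outcomes 1–3; outcomes 4 and 5 (time and qualitative findings) are recorded separately.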
Results:
Early testing suggests that automation tools that include a human checking component produce fewer errors than those that are fully automated. The majority of these errors are missed duplicates, with few unique references removed. However, tools that include a human checking component require more time to deduplicate record sets. We expect to present the error rates and processing times of each tool.
Conclusions:
Our conclusions will be presented at the conference.
Patient, public and/or healthcare consumer involvement:
Although this study is not directly relevant to patients, it will help them by contributing to methods that result in more robust and efficient evidence production.