The present and future use of text mining for study identification

Thomas J (1), Ananiadou S (2), O'Mara-Eves A (3), Kontonatsios G (2), Stansfield C (4)
(1) EPPI-Centre, UCL (CQIMG, CPHG, #CochraneTech)
(2) School of Computer Science, University of Manchester
(3) EPPI-Centre, UCL Institute of Education
(4) EPPI-Centre, SSRU, UCL Institute of Education
Objectives: To discuss:
1. Current practice, and the current evidence base, about how text-mining technologies are used to identify studies in systematic reviews.
2. How newly emerging technologies might bring about even more radical changes to practice.
3. The methodological issues around the use of these technologies for study identification in systematic reviews.
Description: The large and growing number of publications makes identifying relevant studies both complex and time-consuming. Text mining has been offered as a potential solution: by automating part of the screening process, it can save reviewer time.
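To make the idea of partly automated screening concrete, the following is a minimal sketch of screening prioritisation: term weights are learned from records a reviewer has already screened, and the unscreened records are then ranked so that likely-relevant ones are seen first. It is an illustration under stated assumptions, not the method of any tool covered in the workshop. The function names, the smoothed log-odds weighting and the example records are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_term_weights(labelled):
    """Learn a log-odds weight per term from (text, is_relevant) pairs.

    Add-one smoothing keeps the weight finite for terms seen in one class only.
    """
    include, exclude = Counter(), Counter()
    for text, is_relevant in labelled:
        (include if is_relevant else exclude).update(tokenize(text))
    vocab = set(include) | set(exclude)
    n_inc = sum(include.values()) + len(vocab)
    n_exc = sum(exclude.values()) + len(vocab)
    return {
        term: math.log((include[term] + 1) / n_inc)
        - math.log((exclude[term] + 1) / n_exc)
        for term in vocab
    }


def rank_for_screening(weights, unscreened):
    """Return unscreened records sorted so likely-relevant ones come first."""
    def score(text):
        # Terms never seen during training contribute nothing to the score.
        return sum(weights.get(tok, 0.0) for tok in tokenize(text))
    return sorted(unscreened, key=score, reverse=True)


# Hypothetical records a reviewer has already screened (True = include).
labelled = [
    ("randomised controlled trial of smoking cessation counselling", True),
    ("trial of nicotine replacement for smoking cessation", True),
    ("survey of soil bacteria in alpine meadows", False),
    ("geology of alpine rock formations", False),
]
# Hypothetical records still waiting to be screened.
unscreened = [
    "alpine soil erosion survey",
    "counselling trial for smoking cessation in pregnancy",
]

weights = train_term_weights(labelled)
ranked = rank_for_screening(weights, unscreened)
```

In this sketch the reviewer still makes every inclusion decision; the ranking only changes the order in which records are read. Whether it is safe to stop screening before the end of such a ranked list is one of the methodological questions the workshop raises.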
We will summarise the current evidence base underpinning text mining for study selection in systematic reviews. We will also cover new applications of these technologies, including the ‘Evidence Pipeline’ in the Transform project.
We will then cover more advanced techniques that combine text mining with machine learning, including term-based clustering and ‘topic modelling’. With these techniques, the traditional lines that distinguish searching, screening and mapping become blurred, and a single ‘task’ can simultaneously identify, screen and describe research activity.
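As a rough illustration of how one step can both group records and describe them, here is a minimal sketch of term-based clustering. It uses a deliberately simple stand-in technique (greedy single-pass clustering on Jaccard similarity of term sets) rather than topic modelling or any tool used in the workshop. The stopword list, the similarity threshold, the function names and the example titles are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; real tools use much larger ones.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def terms(text):
    """Return the set of lowercase content words in a string."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_terms(records, threshold=0.2):
    """Greedy single-pass clustering.

    Each record joins the first cluster whose pooled vocabulary is similar
    enough; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is {"vocab": set, "members": list}
    for text in records:
        record_terms = terms(text)
        for cluster in clusters:
            if jaccard(record_terms, cluster["vocab"]) >= threshold:
                cluster["members"].append(text)
                cluster["vocab"] |= record_terms
                break
        else:
            clusters.append({"vocab": set(record_terms), "members": [text]})
    return clusters


def describe(cluster, top_n=3):
    """Label a cluster with the terms that occur in the most member records."""
    counts = Counter(t for text in cluster["members"] for t in terms(text))
    # Sort by descending count, then alphabetically, for a stable label.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:top_n]]


# Hypothetical record titles.
records = [
    "smoking cessation counselling trial",
    "school-based physical activity programme",
    "nicotine patches for smoking cessation",
    "physical activity in primary school children",
]

clusters = cluster_by_terms(records)
labels = [describe(c) for c in clusters]
```

The cluster labels act as a crude map of the research activity, while cluster membership could be used to select or set aside groups of records. That overlap is the blurring of searching, screening and mapping described above.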
Small groups (with whole-group feedback) will then try out certain tools and discuss the methodological issues that the use of these technologies raises; this discussion will shape the content of emerging guidance on their use.
Participants are encouraged to bring laptops to try out online resources.