Abstract
Background
ChatGPT and other similar chatbots have been widely applied in the field of medical research. However, it is currently unclear how the use of such tools in medical research should be reported in a transparent and standardized manner.
Objectives
To develop a reporting guideline for the application of large language model-based chatbots in medical research (CHEER) to promote the transparent reporting of the use of ChatGPT and similar chatbots.
Methods
We formed a multidisciplinary expert group comprising members from various medical disciplines, inviting primarily corresponding authors who had published articles related to large language models (LLMs) in selected established medical journals indexed in PubMed. We conducted a scoping review and surveyed stakeholders' attitudes, knowledge, and perspectives on the application of ChatGPT and similar chatbots in medical research to create an initial pool of items for the CHEER checklist. We used a modified Delphi method and virtual consensus meetings to reach agreement on the items and formulate the final CHEER checklist.
Results
One hundred experts from different medical disciplines were invited to participate as panelists, of whom 43 experts from 23 countries or regions completed the Delphi survey between 15 and 30 November 2023. After two rounds of the Delphi survey and a consensus meeting, the final version of CHEER, comprising 10 items, was formulated. The items address the name and version of the chatbot, its function and application context, verification of the generated content, and responsibility for the generated content. The checklist is accompanied by a document with detailed explanations and examples to help users understand and apply the tool.
Conclusions
CHEER is a tool designed to assist medical researchers in disclosing and reporting the use of chatbots when writing research papers. Although LLMs have great potential to support medical research, their use also poses risks that can ultimately lead to harmful decisions in clinical practice. Adherence to CHEER will help evidence users better understand the role of LLMs and evaluate their possible influence on the results. We hope the tool can also help journal editors, peer reviewers, and others evaluate the scientific rigor of manuscripts and support their standardization.