Article type
Abstract
"Background:
Natural language makes both interpersonal communication and human-computer interaction (HCI) more efficient and broadens the ways people can interact with computer systems. Large language models (LLMs) such as OpenAI's GPT series and Google's Bard have shown strong capabilities in sentiment analysis, accurately identifying whether the emotional tendency of a text is positive, negative, or neutral through deep learning analysis of its context.
Although educational and cognitive psychology widely recognize that positive feedback, such as encouraging words, promotes learning and improves performance, its effect on LLM performance remains understudied, particularly in evaluating the quality of clinical practice guidelines in healthcare.
Objectives:
To explore whether encouraging words, used as a form of positive feedback, can improve the performance of ChatGPT-4 in evaluating the quality of clinical practice guidelines. The study also seeks to understand how such feedback mechanisms affect the optimization and performance of LLMs, with the aim of improving their accuracy and efficiency in medical decision support systems.
Methods:
This study builds on an article by Manuel M. Montero-Odasso et al. published in JAMA Network Open, which evaluated 15 clinical practice guidelines using the Appraisal of Guidelines for Research and Evaluation (AGREE-II) Instrument. We designed a pilot study in which the experimental group received prompts containing encouraging words and the control group received prompts containing neutral words. In both groups, the prompts guided ChatGPT-4 to evaluate the guidelines against the 23 items of the AGREE-II Instrument.
Using paired-sample t-tests or Wilcoxon signed-rank tests, this study compares the evaluation results for these 23 items between the experimental and control groups, and compares each group's results with the guideline evaluations reported in the original article, to quantify the effect of encouraging words.
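The paired comparison described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the study's analysis code: the function names `paired_t_statistic` and `wilcoxon_signed_rank` and the example scores are hypothetical, and the sketch computes test statistics only. The choice between the two tests presumably depends on whether the paired differences look approximately normal. In practice, p-values would come from a statistics package such as SciPy (`scipy.stats.ttest_rel`, `scipy.stats.wilcoxon`).

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for two paired samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def wilcoxon_signed_rank(a, b):
    """Return (W_plus, W_minus, n_nonzero) for two paired samples.

    Zero differences are dropped; tied absolute differences share
    the average of the ranks they span.
    """
    if len(a) != len(b):
        raise ValueError("samples must have equal length")
    ordered = sorted((x - y for x, y in zip(a, b) if x != y), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)


# Hypothetical item scores on a 1-7 scale, for illustration only (not study data).
encouraged = [5, 6, 4, 7, 5, 6]
neutral = [4, 6, 3, 5, 5, 4]

print(paired_t_statistic(encouraged, neutral))
print(wilcoxon_signed_rank(encouraged, neutral))
```

The same two functions could be applied to each group's scores against the scores reported in the original article, since those comparisons are also paired by item.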
Results:
The research is ongoing, and the results will be presented at the conference.
Conclusions:
Positive feedback in the form of encouraging words may improve ChatGPT-4's accuracy and efficiency in evaluating guideline quality, which would offer new insight into optimizing large language models for medical decision support systems."