Large Language Model-Based Evaluation of Medical Question Answering Systems: Algorithm Development and Case Study

Reichenpfader, Daniel; Rösslhuemer, Philipp; Denecke, Kerstin (8 May 2024). Large Language Model-Based Evaluation of Medical Question Answering Systems: Algorithm Development and Case Study. In: dHealth 2024: Proceedings of the 18th Health Informatics Meets Digital Health Conference. Studies in Health Technology and Informatics: Vol. 313 (pp. 22-27). Amsterdam: IOS Press. DOI: 10.3233/SHTI240006

SHTI-313-SHTI240006.pdf - Published Version
Available under License Creative Commons: Attribution-Noncommercial (CC-BY-NC).


Background: Healthcare systems are increasingly resource-constrained, leaving less time for important patient-provider interactions. Conversational agents (CAs) could be used to support the provision of information and to answer patients' questions. However, information must be accessible to a variety of patient populations, which requires understanding questions expressed at different language levels.

Methods: This study describes the use of Large Language Models (LLMs) to evaluate predefined medical content in CAs across patient populations. These simulated populations are characterized by a range of health literacy levels. The evaluation framework includes both fully automated and semi-automated procedures to assess the performance of a CA.

Results: A case study in the domain of mammography shows that LLMs can simulate questions from different patient populations. However, the accuracy of the answers provided varies depending on the level of health literacy.

Conclusions: Our scalable evaluation framework enables the simulation of patient populations with different health literacy levels and helps to evaluate domain-specific CAs, thus promoting their integration into clinical practice. Future research aims to extend the framework to CAs without predefined content and to apply LLMs to adapt medical information to the specific (health) literacy level of the user.
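
The evaluation idea described in the abstract (simulated questions at different health literacy levels, scored against predefined content) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `accuracy_by_literacy`, the keyword-based `toy_agent`, the level names, and the hand-written questions are all hypothetical stand-ins, and no LLM is called. The sketch only shows how accuracy might be broken down per literacy level once simulated questions and expected answers are available.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical record layout (an assumption, not taken from the paper):
# (health literacy level, simulated question, ID of the expected predefined answer)
SimulatedQuestion = Tuple[str, str, str]


def accuracy_by_literacy(
    questions: Iterable[SimulatedQuestion],
    agent: Callable[[str], str],
) -> Dict[str, float]:
    """Return the share of correctly answered questions per literacy level."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for level, question, expected_id in questions:
        total[level] += 1
        if agent(question) == expected_id:
            correct[level] += 1
    return {level: correct[level] / total[level] for level in total}


def toy_agent(question: str) -> str:
    """Toy stand-in for a CA with predefined content: plain keyword lookup."""
    q = question.lower()
    if "pain" in q or "hurt" in q:
        return "pain"
    if "radiation" in q:
        return "radiation"
    return "fallback"


# Hand-written stand-ins for LLM-simulated questions (not data from the paper).
simulated = [
    ("high", "Is the radiation dose of a mammography harmful?", "radiation"),
    ("high", "Is breast compression associated with pain?", "pain"),
    ("low", "does it hurt", "pain"),
    ("low", "are the rays bad for me", "radiation"),
]

scores = accuracy_by_literacy(simulated, toy_agent)
print(scores)  # {'high': 1.0, 'low': 0.5}
```

In this toy setup the keyword agent misses the colloquial phrasing "rays", so the lower literacy group scores worse. The numbers are purely illustrative and say nothing about the results reported in the paper.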

Item Type:

Conference or Workshop Item (Paper)

Division/Institute:

School of Engineering and Computer Science > Institute for Patient-centered Digital Health
School of Engineering and Computer Science > Institute for Patient-centered Digital Health > AI for Health
School of Engineering and Computer Science

Name:

Reichenpfader, Daniel (ORCID: 0000-0002-8052-3359);
Rösslhuemer, Philipp and
Denecke, Kerstin (ORCID: 0000-0001-6691-396X)

Subjects:

Q Science > QA Mathematics > QA76 Computer software
R Medicine > R Medicine (General)
R Medicine > RA Public aspects of medicine > RA0421 Public health. Hygiene. Preventive Medicine

ISBN:

978-1-64368-516-8

Series:

Studies in Health Technology and Informatics

Publisher:

IOS Press

Language:

English

Submitter:

Daniel Reichenpfader

Date Deposited:

03 May 2024 11:49

Last Modified:

13 May 2024 11:21

Publisher DOI:

10.3233/SHTI240006

Uncontrolled Keywords:

Natural Language Processing, Consumer Health Information, Algorithms, Conversational Agents, Large Language Model

ARBOR DOI:

10.24451/arbor.21853

URI:

https://arbor.bfh.ch/id/eprint/21853
