Large Language Model-Based Evaluation of Medical Question Answering Systems: Algorithm Development and Case Study

URI
https://arbor.bfh.ch/handle/arbor/37296
Version
Published
Date Issued
2024-05-08
Author(s)
Reichenpfader, Daniel  
Rösslhuemer, Philipp
Denecke, Kerstin  
Type
Conference Paper
Language
English
Subjects

Natural Language Proc...

Consumer Health Infor...

Algorithms

Conversational Agents...

Large Language Model

Abstract
Background:
Healthcare systems are increasingly resource-constrained, leaving less time for important patient-provider interactions. Conversational agents (CAs) could support the provision of information and answer patients’ questions. However, information must be accessible to a variety of patient populations, which requires understanding questions expressed at different language levels.
Methods:
This study describes the use of Large Language Models (LLMs) to evaluate predefined medical content in CAs across patient populations. These simulated populations are characterized by a range of health literacy levels. The evaluation framework includes both fully automated and semi-automated procedures to assess the performance of a CA.
Results:
A case study in the domain of mammography shows that LLMs can simulate questions from different patient populations. However, the accuracy of the answers provided varies depending on the level of health literacy.
Conclusions:
Our scalable evaluation framework enables the simulation of patient populations with different health literacy levels and helps to evaluate domain-specific CAs, thus promoting their integration into clinical practice. Future research aims to extend the framework to CAs without predefined content and to apply LLMs to adapt medical information to the specific (health) literacy level of the user.
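Illustrative sketch (not part of the published abstract):
The fully automated part of such an evaluation could look roughly like the loop below: each predefined question is rephrased for a simulated health literacy level, sent to the CA, and the reply is compared with the predefined answer. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `Item`, `evaluate`, `stub_paraphrase`, and `stub_agent` are hypothetical, the two stubs stand in for an LLM call and for the CA under test, and scoring by exact match with the predefined answer is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Item:
    """One predefined question/answer pair from the CA's content (hypothetical structure)."""
    question: str
    answer: str


def evaluate(
    items: List[Item],
    levels: List[str],
    paraphrase: Callable[[str, str], str],
    ask_agent: Callable[[str], str],
) -> Dict[str, float]:
    """Return, per literacy level, the share of simulated questions that
    the agent answers with the expected predefined answer."""
    if not items:
        raise ValueError("items must not be empty")
    scores: Dict[str, float] = {}
    for level in levels:
        hits = 0
        for item in items:
            # Rephrase the predefined question for the simulated population.
            simulated = paraphrase(item.question, level)
            # Assumption: an answer counts as correct only on exact match.
            if ask_agent(simulated) == item.answer:
                hits += 1
        scores[level] = hits / len(items)
    return scores


# Toy content; the wording is illustrative and not taken from the paper.
ITEMS = [
    Item("What is a mammography?", "An X-ray examination of the breast."),
    Item("Is a mammography painful?", "The compression can be briefly uncomfortable."),
]


def stub_paraphrase(question: str, level: str) -> str:
    """Stand-in for an LLM call that rewrites a question for a literacy level."""
    if level == "low":
        return question.lower().replace("mammography", "breast picture")
    return question


def stub_agent(question: str) -> str:
    """Stand-in for the CA under test: exact lookup in the predefined content."""
    lookup = {item.question: item.answer for item in ITEMS}
    return lookup.get(question, "Sorry, I did not understand.")


scores = evaluate(ITEMS, ["high", "low"], stub_paraphrase, stub_agent)
print(scores)
```

With these stubs, the unchanged high-literacy questions are matched and the rephrased low-literacy ones are not. This only illustrates the shape of a per-level comparison and says nothing about the results reported in the paper.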
Subjects
QA76 Computer software
R Medicine (General)
RA0421 Public health. Hygiene. Preventive Medicine
ISBN
978-1-64368-516-8
DOI
10.24451/arbor.21853
https://doi.org/10.24451/arbor.21853
Publisher DOI
10.3233/SHTI240006
Series/Report No.
Studies in Health Technology and Informatics
Publisher URL
https://ebooks.iospress.nl/doi/10.3233/SHTI240006
Related URL
https://dhealth.at/
Organization
Institute for Patient-centered Digital Health  
AI for Health  
Technik und Informatik  
Volume
313
Conference
dHealth 2024: Proceedings of the 18th Health Informatics Meets Digital Health Conference
Publisher
IOS Press
Submitter
Reichenpfader, Daniel
Citation apa
Reichenpfader, D., Rösslhuemer, P., & Denecke, K. (2024). Large Language Model-Based Evaluation of Medical Question Answering Systems: Algorithm Development and Case Study (Vol. 313, pp. 22–27). IOS Press. https://doi.org/10.24451/arbor.21853
File(s)
open access

Name

SHTI-313-SHTI240006.pdf

License
Attribution-NonCommercial 4.0 International
Version
published
Size

472.06 KB

Format

Adobe PDF

Checksum (MD5)

4ab6f9f417f0a6a54cd4764487cb08ba
