Evaluating Large Language Models for Analysing Safety Risks in Healthcare Incident Reports
Version
Published
Date Issued
2025-08-07
Author(s)
Type
Article
Language
English
Abstract
Incident reports provide a rich source for analysing safety risks in healthcare systems. Natural language processing (NLP) can support the timely analysis and interpretation of these reports. This paper evaluates the potential of large language models (LLMs) to extract the causes of incidents and identify contributing factors from incident reports. As a dataset, we used 10,063 messages from CIRRNET®, the Swiss national database for critical incidents in healthcare. We applied the LLM Gemma-2 to extract events, causes and contributing factors and to group them by theme. The quality of extraction was assessed manually for 100 event reports. Events were extracted with 92% accuracy, causes with 84% and contributing factors with 72%. Extraction of contributing factors fails where the LLM hallucinates or interprets rather than extracts. We conclude that LLMs show potential for analysing incident reports and can improve the efficiency and consistency of incident analysis.
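The per-category accuracy figures in the abstract come from manual assessment of a sample of reports. A minimal sketch of how such figures could be computed is given below. It assumes each assessed report carries one correct/incorrect judgment per extracted field; the field names, the `accuracy_per_field` helper and the four toy judgments are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter

# Hypothetical manual judgments: one dict per assessed report.
# True means the reviewer judged the extracted field to be correct.
judgments = [
    {"event": True, "cause": True, "contributing_factors": False},
    {"event": True, "cause": False, "contributing_factors": False},
    {"event": True, "cause": True, "contributing_factors": True},
    {"event": False, "cause": True, "contributing_factors": True},
]


def accuracy_per_field(judgments):
    """Return the share of reports judged correct for each field."""
    correct = Counter()
    for report in judgments:
        for field, ok in report.items():
            # Adding 0 still registers the field, so fields that are
            # never correct appear in the result with accuracy 0.0.
            correct[field] += int(ok)
    return {field: correct[field] / len(judgments) for field in correct}


scores = accuracy_per_field(judgments)
for field, score in scores.items():
    print(f"{field}: {score:.0%}")
```

With a real sample, `judgments` would hold one entry per manually assessed report, and the resulting shares would correspond to the accuracy values reported per category.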
Publisher DOI
Journal or Series
Studies in health technology and informatics
Journal or Series
MEDINFO 2025: Healthcare Smart × Medicine Deep
Series/Report No.
Studies in Health Technology and Informatics
ISSN
1879-8365
Publisher URL
Volume
329
Publisher
IOS Press
Submitter
Denecke, Kerstin
Citation (APA)
Denecke, K., & Paula, H. (2025). Evaluating Large Language Models for Analysing Safety Risks in Healthcare Incident Reports. In MEDINFO 2025: Healthcare Smart × Medicine Deep (Vol. 329, pp. 386–390). IOS Press. https://doi.org/10.24451/dspace/12099
File(s)
open access
Name
SHTI-329-SHTI250867.pdf
License
Attribution-NonCommercial 4.0 International
Version
published
Size
586.67 KB
Format
Adobe PDF
Checksum (MD5)
29bd03451236f0fc964ad7f3a316ee74
