A Real-World Evaluation of LLM Medication Safety Reviews in NHS Primary Care

Large Language Models (LLMs) show promise for supporting medication safety in healthcare. This paper presents a real-world evaluation of an LLM-powered system for medication safety reviews in NHS Primary Care, a system that identifies potential errors, drug-drug interactions, and adverse reactions from patient records. In a retrospective study on anonymized NHS patient data, the system achieved 100% sensitivity in detecting critical safety issues but correctly identified all issues and interventions in only 46.9% of patients. Failure analysis indicated that failures of contextual reasoning, rather than a lack of medication knowledge, were the dominant failure mechanism, a shortcoming that must be addressed before safe clinical deployment.
