SafeLLM: Extraction as a Hallucination-Resistant Alternative to Rewriting in Safety-Critical Settings

📅 2026-06-11
🤖 AI Summary
This work addresses the limitations of rewrite-based retrieval-augmented generation (RAG) systems in safety- and compliance-critical settings, where free-form rewriting can introduce hallucinations and unstable trade-offs between answer completeness and conciseness. To improve faithfulness and controllability, the authors evaluate extraction as an alternative to generative rewriting, comparing line-number-based source selection, extraction of guideline sentences with explicit safety annotations, and a multi-stage pipeline that refines draft answers using supporting evidence from the source documents. Experiments on local NHS and UK-wide NICE clinical guidelines show that line-number selection performs best, outperforming direct copying and safety-focused strategies with term recall of up to 95%, particularly on protocol-style documents; alternative strategies do better on more verbose documents (up to 97% term recall). Safety-oriented approaches improve precision but introduce systematic omissions, and multi-stage filtering amplifies this trade-off. The study positions extraction as a more reliable and interpretable RAG approach for high-stakes domains.
📝 Abstract
Large language models (LLMs) are increasingly used to access organisational documentation, including standard operating procedures (SOPs), HR policies and institutional guidelines. However, retrieval-augmented generation (RAG) systems that rely on free-form rewriting can introduce hallucinations and unstable trade-offs between completeness and conciseness, particularly in safety- and compliance-critical settings.

Objectives: To evaluate extraction as a hallucination-resistant alternative to rewriting-based RAG and compare strategies that balance precision, recall and safety across document types and model scales.

Methods: We compare multiple prompting strategies, including line-number-based source selection, extraction of relevant guideline sentences with explicit safety annotations, and a multi-stage pipeline that refines draft answers using supporting evidence from source guidelines. Experiments are conducted on documents of varying length and structure, including local NHS acute care and oncology guidelines and UK-wide NICE guidelines, using both frontier-scale and locally deployable models. Performance is assessed using automatic metrics and human expert evaluation of relevance and completeness.

Results: Line-number selection achieves the strongest results, outperforming direct copying and safety-focused strategies across both large and small models while maintaining high term recall (up to 95%) and close alignment with source text. Safety-oriented approaches improve precision but introduce systematic omissions, while multi-stage filtering further amplifies this trade-off. Performance varies with document structure: line-based extraction excels in protocol-like content, whereas alternative strategies perform better on more verbose documents (up to 97% term recall).
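To make the idea of line-number-based source selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes the source document is shown to the model with numbered lines, that the model replies with line numbers or ranges, and that the answer is assembled by copying those lines verbatim. The function names, the reply format, the toy guideline text and the simple `term_recall` definition (share of reference terms found in the extracted text) are all illustrative assumptions; the abstract does not specify them.

```python
import re

def number_lines(text):
    """Split text into non-empty lines and build a numbered view for a prompt."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    numbered = "\n".join(f"{i}: {ln}" for i, ln in enumerate(lines, 1))
    return lines, numbered

def parse_selection(reply, n_lines):
    """Parse a reply such as 'Lines 1-2 and 4' into sorted, valid line numbers.

    Numbers outside 1..n_lines are ignored rather than trusted.
    """
    picked = set()
    for start, end in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", reply):
        lo = int(start)
        hi = int(end) if end else lo
        picked.update(range(max(lo, 1), min(hi, n_lines) + 1))
    return sorted(picked)

def extract(lines, selection):
    """Copy the selected lines verbatim, so no new wording is generated."""
    return [lines[i - 1] for i in selection]

def term_recall(reference_terms, answer):
    """Assumed metric: fraction of reference terms present in the answer."""
    if not reference_terms:
        return 0.0
    text = answer.lower()
    hits = sum(1 for term in reference_terms if term.lower() in text)
    return hits / len(reference_terms)

# Invented toy text for illustration only; not taken from any real guideline.
GUIDELINE = """Confirm patient identity before starting.
Record known allergies in the chart.

Escalate to the senior clinician if unsure.
Document the time of each step."""

lines, numbered = number_lines(GUIDELINE)
model_reply = "Lines 1-2 and 4"  # stand-in for a real model response
selection = parse_selection(model_reply, len(lines))
answer = extract(lines, selection)
recall = term_recall(["identity", "allergies", "escalate"], " ".join(answer))
```

Because the answer is built only from copied source lines, any hallucination risk in this sketch is confined to which lines are chosen, not to what they say. The omitted term in the toy example shows how a selection that skips a relevant line lowers recall.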
Problem

Research questions and friction points this paper is trying to address.

hallucination
retrieval-augmented generation
safety-critical
compliance
document extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

extraction
hallucination-resistant
retrieval-augmented generation
line-number selection
safety-critical
Julia Ive
University College London
Felix Jozsa
Institute of Health Informatics, University College London, London, UK
Evridiki Georgaki
Institute of Health Informatics, University College London, London, UK
Nabeel Sheikh
Somerset NHS Foundation Trust, UK
Emma Cattell
Somerset NHS Foundation Trust, UK
Nick Jackson
King’s College Hospital, Denmark Hill, London, UK
Paulina Bondaronek
University College London
digital health, evaluation, natural language processing
Ciaran Scott Hill
National Hospital for Neurology and Neurosurgery, Queen Square, London, UK
Richard Dobson
Institute of Health Informatics, University College London, London, UK