MedNuggetizer: Confidence-Based Information Nugget Extraction from Medical Documents

2025-12-17
Citations: 0
Influential: 0
AI Summary
Medical documents suffer from fragmented evidence and poor cross-source consistency, hindering the efficiency of clinical decision support. To address this, we propose a query-driven framework for extracting trustworthy information snippets and performing dynamic clustering. Methodologically, we introduce a novel confidence-calibrated, multi-round large language model (LLM) extraction mechanism that integrates repeated sampling verification, semantic clustering, and confidence-weighted fusion, enabling traceable alignment and aggregation of evidence across clinical guidelines and scholarly literature. Evaluated in the context of antibiotic prophylaxis for prostate biopsy, domain experts assessed snippet extraction accuracy at 92.3%, demonstrating substantial improvements in locating, verifying, and integrating evidence-based statements within lengthy documents. This work establishes an interpretable, reproducible, and robust paradigm for structured evidence generation in clinical decision support systems.

๐Ÿ“ Abstract
We present MedNuggetizer, https://mednugget-ai.de/; access is available upon request.}, a tool for query-driven extraction and clustering of information nuggets from medical documents to support clinicians in exploring underlying medical evidence. Backed by a large language model (LLM), extit{MedNuggetizer} performs repeated extractions of information nuggets that are then grouped to generate reliable evidence within and across multiple documents. We demonstrate its utility on the clinical use case of extit{antibiotic prophylaxis before prostate biopsy} by using major urological guidelines and recent PubMed studies as sources of information. Evaluation by domain experts shows that extit{MedNuggetizer} provides clinicians and researchers with an efficient way to explore long documents and easily extract reliable, query-focused medical evidence.
Problem

Research questions and friction points this paper is trying to address.

Extracts and clusters medical information nuggets from documents
Supports clinicians in exploring evidence for medical queries
Uses LLM to generate reliable evidence across multiple sources
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLM for repeated extraction of information nuggets
Clusters nuggets to generate reliable cross-document evidence
Query-driven tool for exploring medical documents efficiently
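
The repeat-then-cluster idea listed above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the extraction runs are hard-coded placeholder strings rather than LLM output, lexical similarity from Python's difflib stands in for semantic clustering, and confidence is defined simply as the fraction of runs that contributed to a cluster. The names similar and cluster_nuggets are hypothetical.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Crude lexical stand-in for a semantic similarity check (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_nuggets(runs: list[list[str]], threshold: float = 0.8) -> list[dict]:
    """Greedily group nuggets from repeated runs and score each group.

    Confidence is the fraction of runs that contributed at least one nugget
    to the cluster. This is an illustrative definition, not the paper's.
    """
    clusters: list[dict] = []
    for run_id, nuggets in enumerate(runs):
        for nugget in nuggets:
            for cluster in clusters:
                if similar(nugget, cluster["representative"], threshold):
                    cluster["members"].append(nugget)
                    cluster["runs"].add(run_id)
                    break
            else:
                # No existing cluster is close enough: start a new one.
                clusters.append(
                    {"representative": nugget, "members": [nugget], "runs": {run_id}}
                )
    for cluster in clusters:
        cluster["confidence"] = len(cluster["runs"]) / len(runs)
    return sorted(clusters, key=lambda c: c["confidence"], reverse=True)


# Placeholder outputs of three hypothetical extraction runs for one query.
runs = [
    [
        "Guideline A recommends option X before the procedure.",
        "Study B reports fewer infections with option Y.",
    ],
    [
        "Guideline A recommends option X before the procedure",
        "Study B reports fewer infections with option Y.",
    ],
    ["Guideline A recommends option X prior to the procedure."],
]

result = cluster_nuggets(runs)
for cluster in result:
    print(f"{cluster['confidence']:.2f}  {cluster['representative']}")
```

In this sketch, a nugget that recurs in every run ends up with a higher score than one that appears only in some runs, which is the intuition behind using repetition as a reliability signal. A real system would likely replace the lexical check with embedding-based or LLM-based matching.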
Gregor Donabauer
Information Science, University of Regensburg
Natural Language Processing, Machine Learning, Information Retrieval
Samy Ateia
Information Science, University of Regensburg, Regensburg, Germany
Udo Kruschwitz
Professor, University of Regensburg
Information Retrieval, Natural Language Engineering, Natural Language Processing
Maximilian Burger
Department of Urology, St. Josef Medical Center, Regensburg, Germany
Matthias May
Department of Urology, St. Elisabeth Hospital Straubing, Straubing, Germany
Christian Gilfrich
Department of Urology, St. Elisabeth Hospital Straubing, Straubing, Germany
Maximilian Haas
Department of Urology, St. Josef Medical Center, Regensburg, Germany
Julio Ruben Rodas Garzaro
Department of Urology, St. Elisabeth Hospital Straubing, Straubing, Germany
Christoph Eckl
Department of Urology, St. Josef Medical Center, Regensburg, Germany