MedNuggetizer: Confidence-Based Information Nugget Extraction from Medical Documents

📅 2025-12-17

🤖 AI Summary

Medical documents suffer from fragmented evidence and poor cross-source consistency, hindering the efficiency of clinical decision support. To address this, we propose a query-driven framework for extracting trustworthy information snippets and performing dynamic clustering. Methodologically, we introduce a novel confidence-calibrated, multi-round large language model (LLM) extraction mechanism that integrates repeated sampling verification, semantic clustering, and confidence-weighted fusion—enabling traceable alignment and aggregation of evidence across clinical guidelines and scholarly literature. Evaluated in the context of antibiotic prophylaxis for prostate biopsy, domain experts assessed snippet extraction accuracy at 92.3%, demonstrating substantial improvements in locating, verifying, and integrating evidence-based statements within lengthy documents. This work establishes an interpretable, reproducible, and robust paradigm for structured evidence generation in clinical decision support systems.

📝 Abstract

We present MedNuggetizer (https://mednugget-ai.de/; access is available upon request), a tool for query-driven extraction and clustering of information nuggets from medical documents to support clinicians in exploring underlying medical evidence. Backed by a large language model (LLM), MedNuggetizer performs repeated extractions of information nuggets that are then grouped to generate reliable evidence within and across multiple documents. We demonstrate its utility on the clinical use case of antibiotic prophylaxis before prostate biopsy by using major urological guidelines and recent PubMed studies as sources of information. Evaluation by domain experts shows that MedNuggetizer provides clinicians and researchers with an efficient way to explore long documents and easily extract reliable, query-focused medical evidence.

Problem

Research questions and friction points this paper is trying to address.

Extracts and clusters medical information nuggets from documents

Supports clinicians in exploring evidence for medical queries

Uses LLM to generate reliable evidence across multiple sources

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLM for repeated extraction of information nuggets

Clusters nuggets to generate reliable cross-document evidence

Query-driven tool for exploring medical documents efficiently
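
The repeated-extraction and clustering idea listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the LLM calls are replaced by hard-coded placeholder runs, token-level Jaccard similarity stands in for whatever semantic clustering the tool uses, and the confidence score (the share of runs in which a cluster appears) is an assumed, simplified definition. All function names, the threshold, and the example strings are hypothetical.

```python
import re
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a nugget."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (a stand-in for semantic similarity)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster_nuggets(runs: List[List[str]], threshold: float = 0.5) -> List[Dict]:
    """Greedily group nuggets from repeated extraction runs.

    A nugget joins the first cluster whose representative is similar enough;
    otherwise it starts a new cluster. Confidence is the fraction of runs
    that contributed at least one nugget to the cluster (an assumption).
    """
    clusters: List[Dict] = []
    for run_idx, run in enumerate(runs):
        for nugget in run:
            for cluster in clusters:
                if jaccard(nugget, cluster["representative"]) >= threshold:
                    cluster["members"].append(nugget)
                    cluster["runs"].add(run_idx)
                    break
            else:
                clusters.append(
                    {"representative": nugget, "members": [nugget], "runs": {run_idx}}
                )
    n_runs = len(runs)
    for cluster in clusters:
        cluster["confidence"] = len(cluster["runs"]) / n_runs if n_runs else 0.0
    # Stable sort: clusters that recur across more runs come first.
    return sorted(clusters, key=lambda c: c["confidence"], reverse=True)


# Placeholder output of three hypothetical extraction runs for one query.
# These strings are toy data, not medical statements from any source.
runs = [
    ["Drug A is recommended before procedure X", "Drug B should be avoided"],
    ["Drug A is recommended before the procedure X"],
    ["Drug A is recommended", "Culture results may guide the choice"],
]

clusters = cluster_nuggets(runs)
for c in clusters:
    print(f'{c["confidence"]:.2f}  {c["representative"]}')
```

In this toy setup, a statement that recurs in every run ends up in a single high-confidence cluster, while statements that appear only once receive a low score and can be flagged for review rather than presented as settled evidence.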
