RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current chest X-ray (CXR) interpretation methods suffer from three key limitations: weak clinical interpretability, insufficient multimodal (vision–text) evidence fusion, and inconsistent tool outputs lacking dynamic validation mechanisms. To address these, we propose a clinical-prior-guided multi-agent framework that emulates radiologists’ diagnostic workflow, enabling vision-grounded collaborative reasoning. Specialized agents perform image analysis, report generation, consistency verification, and external knowledge retrieval; cross-modal evidence integration is achieved via multimodal fusion and vision–language alignment. We further introduce a dynamic conflict-resolution mechanism and retrieval-augmented contextual verification to enhance robustness and reliability. Experiments demonstrate significant improvements in diagnostic accuracy (+4.2%) and report consistency (+18.7%), yielding structured reports that are more transparent, clinically aligned, and compliant with established guidelines.
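The workflow the summary describes (image analysis, report generation, consistency verification, retrieval) can be pictured as a pipeline of specialized agents passing a shared case state. The sketch below is purely illustrative: the agent names, thresholds, and data structures are assumptions for exposition, not the paper's implementation.

```python
# Hypothetical sketch of a radiologist-like multi-agent pipeline.
# All agent logic here is a stand-in for the paper's actual components.
from dataclasses import dataclass, field

@dataclass
class Finding:
    label: str          # e.g. "cardiomegaly"
    confidence: float   # agent confidence in [0, 1]
    source: str         # which agent produced it

@dataclass
class CaseState:
    findings: list = field(default_factory=list)
    report: str = ""
    conflicts: list = field(default_factory=list)

def image_analysis_agent(state: CaseState) -> None:
    # Stand-in for a vision model: emits candidate findings with confidences.
    state.findings.append(Finding("cardiomegaly", 0.82, "vision"))
    state.findings.append(Finding("pleural effusion", 0.35, "vision"))

def report_generation_agent(state: CaseState) -> None:
    # Draft a structured report from findings above a confidence threshold.
    kept = [f.label for f in state.findings if f.confidence >= 0.5]
    state.report = "Findings: " + (", ".join(kept) if kept else "none")

def consistency_agent(state: CaseState) -> None:
    # Flag low-confidence findings that nonetheless appear in the report.
    for f in state.findings:
        if f.confidence < 0.5 and f.label in state.report:
            state.conflicts.append(f.label)

def run_pipeline() -> CaseState:
    state = CaseState()
    for agent in (image_analysis_agent,
                  report_generation_agent,
                  consistency_agent):
        agent(state)
    return state

state = run_pipeline()
print(state.report)  # -> Findings: cardiomegaly
```

In this toy run, only the high-confidence finding reaches the report and the verifier raises no conflict; the paper's framework additionally grounds each step in the image and in retrieved clinical knowledge.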

📝 Abstract
Agentic systems offer a potential path to solve complex clinical tasks through collaboration among specialized agents, augmented by tool use and external knowledge bases. Nevertheless, for chest X-ray (CXR) interpretation, prevailing methods remain limited: (i) reasoning is frequently neither clinically interpretable nor aligned with guidelines, reflecting mere aggregation of tool outputs; (ii) multimodal evidence is insufficiently fused, yielding text-only rationales that are not visually grounded; and (iii) systems rarely detect or resolve cross-tool inconsistencies and provide no principled verification mechanisms. To bridge the above gaps, we present RadAgents, a multi-agent framework for CXR interpretation that couples clinical priors with task-aware multimodal reasoning. In addition, we integrate grounding and multimodal retrieval-augmentation to verify and resolve context conflicts, resulting in outputs that are more reliable, transparent, and consistent with clinical practice.
Problem

Research questions and friction points this paper is trying to address.

Improve clinical interpretability and guideline alignment in CXR interpretation
Address insufficient multimodal evidence fusion in radiology workflows
Detect and resolve cross-tool inconsistencies with verification mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework with clinical priors
Multimodal reasoning with grounding retrieval
Conflict resolution via verification mechanisms
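One way to picture the conflict-resolution idea in the last bullet: when two tools disagree on a finding, an external knowledge lookup breaks the tie. The function and knowledge-base stub below are hypothetical, chosen only to make the mechanism concrete; the paper's actual verification procedure may differ.

```python
# Illustrative tie-breaking via retrieval-augmented verification.
# KNOWLEDGE_BASE is a stand-in for an external clinical knowledge source.
KNOWLEDGE_BASE = {
    "pulmonary edema": True,   # guideline-backed prior supports the finding
    "pneumothorax": False,     # prior argues against it in this toy context
}

def resolve_conflict(finding: str, vote_a: bool, vote_b: bool) -> bool:
    """Return the accepted verdict for a finding two tools voted on."""
    if vote_a == vote_b:
        return vote_a  # agreement: nothing to resolve
    # Disagreement: consult retrieved knowledge, defaulting to rejection.
    return KNOWLEDGE_BASE.get(finding, False)

print(resolve_conflict("pulmonary edema", True, False))  # -> True
print(resolve_conflict("pneumothorax", True, False))     # -> False
```

Defaulting to rejection on an unknown finding is a conservative choice for this sketch; a real system would likely escalate unresolved conflicts for further verification instead.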
Kai Zhang, Oracle Health AI
Corey D Barrett, Oracle Health AI
Jangwon Kim, Oracle Health AI
Lichao Sun, Lehigh University
Tara Taghavi, Oracle Health AI
Krishnaram Kenthapadi, Oracle Health AI
Fairness/Transparency/Explainability/Privacy in AI/ML Systems, Algorithms, Data Mining, Social