🤖 AI Summary
Medical phrase grounding (MPG) suffers from limited generalizability and poor zero-shot capability due to the scarcity of high-quality annotated data. To address this, we propose Anatomical Grounding Pre-training (AGP), the first anatomy-aware pre-training paradigm specifically designed for MPG. AGP leverages large-scale anatomical annotation datasets (e.g., Chest ImaGenome) and employs contrastive learning to jointly model radiology reports and image-based anatomical regions, enabling fine-grained alignment between textual phrases and anatomical structures. Crucially, AGP requires no task-specific annotations for pre-training. On the MS-CXR benchmark, it achieves state-of-the-art zero-shot localization performance; after fine-tuning, it attains an mIoU of 61.2, establishing a new state of the art. Our core contribution is an anatomy-informed multimodal pre-training objective that bridges linguistic descriptions and anatomical image regions, improving transferability and data efficiency in MPG.
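The contrastive alignment between phrases and anatomical regions can be illustrated with a minimal sketch. This is a generic symmetric InfoNCE objective over matched phrase-region embedding pairs, not the paper's exact loss; the function names, temperature value, and NumPy implementation are illustrative assumptions.

```python
import numpy as np

def info_nce(phrase_emb, region_emb, temperature=0.07):
    """Symmetric InfoNCE loss: the i-th phrase should match the i-th region.

    Hypothetical sketch of a contrastive phrase-region alignment objective;
    the paper's actual loss and temperature may differ.
    """
    # L2-normalize both sets of embeddings
    p = phrase_emb / np.linalg.norm(phrase_emb, axis=1, keepdims=True)
    r = region_emb / np.linalg.norm(region_emb, axis=1, keepdims=True)
    logits = p @ r.T / temperature  # (N, N) cosine-similarity matrix

    def xent(l):
        # cross-entropy with the diagonal (matched pairs) as targets
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        idx = np.arange(len(l))
        return -logp[idx, idx].mean()

    # average phrase->region and region->phrase directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Correctly matched pairs yield a low loss, while permuted (misaligned) pairs yield a higher one, which is what drives the phrase and region encoders toward a shared embedding space.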
📄 Abstract
Medical Phrase Grounding (MPG) maps radiological findings described in medical reports to specific regions in medical images. The primary obstacle hindering progress in MPG is the scarcity of annotated data available for training and validation. We propose anatomical grounding as an in-domain pre-training task that aligns anatomical terms with their corresponding regions in medical images, leveraging large-scale datasets such as Chest ImaGenome. Our empirical evaluation on MS-CXR demonstrates that anatomical grounding pre-training significantly improves performance in both zero-shot and fine-tuning settings, outperforming state-of-the-art MPG models. Our fine-tuned model achieves state-of-the-art performance on MS-CXR with an mIoU of 61.2, demonstrating the effectiveness of anatomical grounding pre-training for MPG.
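The mIoU metric used above averages the intersection-over-union between each predicted bounding box and its ground-truth box. A minimal sketch (the `(x1, y1, x2, y2)` box convention and function names are assumptions, not from the paper):

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def miou(preds, gts):
    """Mean IoU over paired predicted and ground-truth boxes."""
    return sum(box_iou(p, g) for p, g in zip(preds, gts)) / len(preds)
```

For example, boxes `(0, 0, 2, 2)` and `(1, 1, 3, 3)` overlap in a unit square out of a union of 7, giving an IoU of 1/7; an mIoU of 61.2 corresponds to an average overlap of 0.612 across the benchmark's phrase-box pairs.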