MedVQA-TREE: A Multimodal Reasoning and Retrieval Framework for Sarcopenia Prediction

📅 2025-08-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Ultrasound-based sarcopenia diagnosis faces three key challenges: subtle imaging features, scarce annotated data, and missing clinical context. To address these, we propose MedVQA-TREE, an interpretable diagnostic framework that integrates multimodal reasoning with knowledge enhancement. First, a hierarchical visual understanding model combines anatomy-aware region segmentation with graph-structured spatial reasoning. Second, a gated feature fusion mechanism dynamically integrates imaging features with clinical semantic representations. Third, UMLS-guided multi-hop, multi-query retrieval jointly accesses PubMed and a domain-specific sarcopenia knowledge base to inject external clinical knowledge. Evaluated on both public and in-house datasets, the method reaches up to 99% diagnostic accuracy, surpassing state-of-the-art methods by over 10%, while improving interpretability and clinical adaptability through transparent, knowledge-grounded decision pathways.

📝 Abstract
Accurate sarcopenia diagnosis via ultrasound remains challenging due to subtle imaging cues, limited labeled data, and the absence of clinical context in most models. We propose MedVQA-TREE, a multimodal framework that integrates a hierarchical image interpretation module, a gated feature-level fusion mechanism, and a novel multi-hop, multi-query retrieval strategy. The vision module includes anatomical classification, region segmentation, and graph-based spatial reasoning to capture coarse, mid-level, and fine-grained structures. A gated fusion mechanism selectively integrates visual features with textual queries, while clinical knowledge is retrieved through a UMLS-guided pipeline accessing PubMed and a sarcopenia-specific external knowledge base. MedVQA-TREE was trained and evaluated on two public MedVQA datasets (VQA-RAD and PathVQA) and a custom sarcopenia ultrasound dataset. The model achieved up to 99% diagnostic accuracy and outperformed previous state-of-the-art methods by over 10%. These results underscore the benefit of combining structured visual understanding with guided knowledge retrieval for effective AI-assisted diagnosis in sarcopenia.
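The abstract does not give the equations behind the gated fusion mechanism. As a rough illustration only, the sketch below uses a common formulation in which a sigmoid gate, computed from the concatenated visual and text features, decides per dimension how much of each modality to keep. The function name `gated_fusion`, the weight layout, and the formula itself are assumptions made for this example, not the paper's implementation.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    visual: List[float],
    text: List[float],
    weights: List[List[float]],
    bias: List[float],
) -> List[float]:
    """Fuse two equal-length feature vectors with a learned gate.

    Assumed formulation (not taken from the paper):
        gate_i  = sigmoid(weights[i] . [visual; text] + bias[i])
        fused_i = gate_i * visual_i + (1 - gate_i) * text_i
    """
    if len(visual) != len(text):
        raise ValueError("visual and text features must have the same length")

    concat = visual + text  # [visual; text]
    fused = []
    for i in range(len(visual)):
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        gate = sigmoid(z)
        # A gate near 1 keeps the visual feature, a gate near 0 keeps the text feature.
        fused.append(gate * visual[i] + (1.0 - gate) * text[i])
    return fused


if __name__ == "__main__":
    v = [1.0, 0.0]
    t = [0.0, 1.0]
    zero_w = [[0.0] * 4, [0.0] * 4]
    # With zero weights and bias the gate is 0.5, so the result is the average.
    print(gated_fusion(v, t, zero_w, [0.0, 0.0]))
```

In a trained model the weights and bias would be learned, so the gate can lean on the image where the visual evidence is strong and on the text query elsewhere.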
Problem

Research questions and friction points this paper is trying to address.

Ultrasound-based sarcopenia diagnosis remains difficult because the imaging cues are subtle
Labeled sarcopenia ultrasound data is limited
Most existing models lack clinical context alongside the image
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical image interpretation module
Gated feature-level fusion mechanism
Multi-hop multi-query retrieval strategy
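
The multi-hop, multi-query retrieval idea can be sketched on toy data. The example below is a minimal illustration under stated assumptions: the `SYNONYMS` dictionary is a hypothetical stand-in for UMLS concept expansion, `CORPUS` is an invented in-memory collection standing in for PubMed and the sarcopenia knowledge base, and matching is plain substring search. None of these names or data come from the paper.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for UMLS-guided concept expansion.
SYNONYMS: Dict[str, List[str]] = {
    "sarcopenia": ["muscle loss", "low muscle mass"],
    "muscle loss": ["muscle atrophy"],
}

# Invented toy corpus standing in for the external knowledge sources.
CORPUS: Dict[str, str] = {
    "doc1": "sarcopenia is linked to reduced grip strength",
    "doc2": "low muscle mass on ultrasound shows reduced muscle thickness",
    "doc3": "muscle atrophy raises echo intensity",
    "doc4": "unrelated note on cardiac imaging",
}


def expand(term: str) -> List[str]:
    """Multi-query step: one term becomes several related queries."""
    return [term] + SYNONYMS.get(term, [])


def search(query: str, corpus: Dict[str, str]) -> List[str]:
    """Return ids of documents whose text contains the query string."""
    return sorted(doc_id for doc_id, text in corpus.items() if query in text.lower())


def multi_hop_retrieve(seed_terms: List[str], corpus: Dict[str, str], hops: int = 2) -> List[str]:
    """Run expanded queries, then reuse the new terms as seeds for the next hop."""
    frontier = list(seed_terms)
    seen_queries: Set[str] = set()
    hits: List[str] = []
    for _ in range(hops):
        next_frontier: List[str] = []
        for term in frontier:
            for query in expand(term):
                if query in seen_queries:
                    continue
                seen_queries.add(query)
                for doc_id in search(query, corpus):
                    if doc_id not in hits:
                        hits.append(doc_id)
                if query != term:
                    next_frontier.append(query)  # explore this concept on the next hop
        frontier = next_frontier
    return hits


if __name__ == "__main__":
    print(multi_hop_retrieve(["sarcopenia"], CORPUS, hops=1))
    print(multi_hop_retrieve(["sarcopenia"], CORPUS, hops=2))
```

In this toy setup the second hop reaches a document that no first-hop query matches, which is the reason for chaining expansions instead of issuing a single query.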
Authors
Pardis Moradbeiki
SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
Nasser Ghadiri
Ingham Institute for Applied Medical Research | IUT
Artificial Intelligence, Machine Learning, Natural Language Processing, Data Fusion
Sayed Jalal Zahabi
Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
Uffe Kock Wiil
Professor, University of Southern Denmark
health informatics, security informatics, social network analysis and mining, hypermedia, data-driven health technology
Kristoffer Kittelmann Brockhattingen
Geriatric Research Unit, Department of Clinical Research, University of Southern Denmark, Odense, Denmark
Ali Ebrahimi
SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark