M3D-BFS: a Multi-stage Dynamic Fusion Strategy for Sample-Adaptive Multi-Modal Brain Network Analysis

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing static fusion strategies in multimodal brain network analysis, which fail to adaptively adjust the fusion process according to input samples, thereby constraining model performance. To overcome this, the study introduces a dynamic fusion mechanism into multimodal brain network modeling for the first time, proposing a multi-stage dynamic fusion framework. This framework employs Mixture-of-Experts (MoE) modules tailored separately for unimodal and multimodal representations, enabling input-driven adaptive fusion during inference. Furthermore, a three-stage progressive training strategy combined with a multimodal disentanglement loss is incorporated to effectively mitigate expert collapse. Extensive experiments on multiple real-world brain network datasets demonstrate that the proposed method significantly outperforms current static fusion approaches, confirming its effectiveness and superiority.
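The summary describes mixture-of-experts (MoE) modules whose routing changes per input sample. As a rough illustration of that idea, here is a minimal NumPy sketch of sample-adaptive MoE fusion over SC/FC representations: a gating network produces per-sample expert weights, so different inputs route through different expert mixtures. All names and dimensions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MoEFusion:
    """Toy mixture-of-experts fusion layer (illustrative only).

    Each expert is a random linear map; a gating network produces
    per-sample expert weights, so the effective fusion computation
    changes with the input sample.
    """

    def __init__(self, d_in, d_out, n_experts):
        self.experts = [rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)
                        for _ in range(n_experts)]
        self.gate_w = rng.standard_normal((d_in, n_experts)) / np.sqrt(d_in)

    def __call__(self, z_sc, z_fc):
        # concatenate uni-modal (SC, FC) representations per sample
        z = np.concatenate([z_sc, z_fc], axis=-1)          # (batch, d_in)
        gates = softmax(z @ self.gate_w)                   # (batch, n_experts)
        expert_outs = np.stack([z @ w for w in self.experts], axis=1)
        # weighted sum of expert outputs; weights differ per sample
        fused = (gates[..., None] * expert_outs).sum(axis=1)
        return fused, gates

# two samples with different inputs get different expert weightings
moe = MoEFusion(d_in=8, d_out=4, n_experts=3)
z_sc = rng.standard_normal((2, 4))
z_fc = rng.standard_normal((2, 4))
fused, gates = moe(z_sc, z_fc)
```

In a real model the experts and gate would be learned jointly; this sketch only shows the routing mechanics that make the computation input-dependent.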
📝 Abstract
Multi-modal fusion is of great significance in neuroscience: it integrates information from different modalities and can achieve better performance than uni-modal methods on downstream tasks. Current multi-modal fusion methods for brain networks, which mainly focus on the structural connectivity (SC) and functional connectivity (FC) modalities, are static in nature: they feed different samples into the same model with identical computation, ignoring inherent differences between input samples. This lack of sample adaptation limits further gains in model performance. To this end, we propose a multi-stage dynamic fusion strategy (M3D-BFS) for sample-adaptive multi-modal brain network analysis. Unlike static fusion methods, we design separate mixtures-of-experts (MoEs) for uni- and multi-modal representations, so that the active modules adapt to each input sample during inference. To alleviate the expert-collapse problem of MoE training, we divide our method into three stages: we first train the uni-modal encoders separately, then pretrain the individual experts of the MoEs, and finally fine-tune the whole model. A multi-modal disentanglement loss is designed to enhance the final representations. To the best of our knowledge, this is the first work on dynamic fusion for multi-modal brain network analysis. Extensive experiments on several real-world datasets demonstrate the superiority of M3D-BFS.
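The abstract's three-stage schedule (train uni-modal encoders, pretrain individual experts, fine-tune everything) can be sketched as a simple table of which modules receive gradient updates per stage. Module names below are placeholders assumed for illustration, not the paper's code.

```python
# Which modules are updated in each training stage (placeholder names).
# Stage 1: train the uni-modal encoders separately.
# Stage 2: pretrain the individual MoE experts with encoders frozen.
# Stage 3: fine-tune the whole model jointly.
STAGES = {
    1: {"sc_encoder", "fc_encoder"},
    2: {"uni_modal_experts", "multi_modal_experts"},
    3: {"sc_encoder", "fc_encoder", "uni_modal_experts",
        "multi_modal_experts", "gating", "classifier"},
}

def trainable(module: str, stage: int) -> bool:
    """Return True if `module` receives gradient updates during `stage`."""
    return module in STAGES[stage]
```

In a framework like PyTorch this table would typically drive `requires_grad` flags or optimizer parameter groups before each stage begins.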
Problem

Research questions and friction points this paper is trying to address.

multi-modal fusion
sample adaptation
brain network analysis
dynamic fusion
structural and functional connectivity
Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamic fusion
sample-adaptive
mixture-of-experts
multi-modal brain network
multi-stage training
Rui Dong
Ph.D. candidate, University of Michigan
program synthesis, formal methods, program verification
Xiaotong Zhang
School of Computer Science and Engineering, Southeast University
Jiaxing Li
School of Computer Science and Engineering, Southeast University
Yueying Li
School of Computer Science and Engineering, Southeast University
Jiayin Wei
School of Computer Science and Engineering, Southeast University
Youyong Kong
Associate Professor at School of Computer Science and Engineering, Southeast University
medical image processing, machine learning, brain network analysis