🤖 AI Summary
To address weak generalization and rigid modality coupling in multimodal data fusion, this paper proposes Meta Fusion, a unified, model-agnostic multimodal fusion framework. It covers early, intermediate, and late fusion as special cases and trains a cohort of models, each built on a different combination of latent representations across modalities, in the spirit of deep mutual learning and ensemble learning. A soft information-sharing mechanism lets cohort members learn from one another during training. Theoretical analysis shows that this soft information sharing reduces generalization error. Empirically, Meta Fusion consistently outperforms conventional fusion strategies in extensive simulation studies, and it is further validated on real-world Alzheimer’s disease detection and neural decoding tasks.
📝 Abstract
Developing effective multimodal data fusion strategies has become increasingly essential for improving the predictive power of statistical machine learning methods across a wide range of applications, from autonomous driving to medical diagnosis. Traditional fusion methods, including early, intermediate, and late fusion, integrate data at different stages, each offering distinct advantages and limitations. In this paper, we introduce Meta Fusion, a flexible and principled framework that unifies these existing strategies as special cases. Motivated by deep mutual learning and ensemble learning, Meta Fusion constructs a cohort of models based on various combinations of latent representations across modalities, and further boosts predictive performance through soft information sharing within the cohort. Our approach is model-agnostic in learning the latent representations, allowing it to flexibly adapt to the unique characteristics of each modality. Theoretically, our soft information sharing mechanism reduces the generalization error. Empirically, Meta Fusion consistently outperforms conventional fusion strategies in extensive simulation studies. We further validate our approach on real-world applications, including Alzheimer's disease detection and neural decoding.
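The cohort construction and soft information sharing described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the cohort contains one model per non-empty subset of modalities, and it uses the standard deep mutual learning loss (each model's cross-entropy plus the average KL divergence from its peers' predictions) as a stand-in for the paper's sharing rule, which may differ. The function names, the modality names, and the weight `rho` are hypothetical.

```python
import math
from itertools import combinations


def build_cohort(modalities):
    """Return every non-empty subset of modalities.

    Assumption: each subset stands for one cohort model that fuses the
    latent representations of exactly those modalities.
    """
    names = sorted(modalities)
    return [c for r in range(1, len(names) + 1) for c in combinations(names, r)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_learning_loss(all_logits, label, k, rho=1.0):
    """Loss for cohort member k on one example.

    Supervised cross-entropy plus rho times the mean KL divergence from
    each peer's prediction to member k's prediction. rho is a hypothetical
    knob for how strongly members share information; rho=0 trains each
    member independently.
    """
    probs = [softmax(z) for z in all_logits]
    peers = [p for j, p in enumerate(probs) if j != k]
    penalty = 0.0
    if peers:
        penalty = sum(kl_divergence(p, probs[k]) for p in peers) / len(peers)
    return cross_entropy(probs[k], label) + rho * penalty


# Usage: three hypothetical modalities and made-up logits for two members.
cohort = build_cohort(["imaging", "genetics", "clinical"])
example_logits = [[2.0, 0.5], [0.1, 1.5]]
loss_member_0 = mutual_learning_loss(example_logits, label=0, k=0, rho=1.0)
```

In this sketch the single-modality subsets play the role of unimodal learners and the full subset plays the role of a model that sees all modalities, which is one way to read the claim that the existing fusion strategies appear as special cases of the cohort.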