MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities

📅 2024-04-20
🏛️ arXiv.org
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This study addresses heterogeneous knowledge transfer, i.e., effective knowledge propagation across models with incompatible architectures, tasks, and data modalities. The authors propose a general-purpose transfer framework that requires neither structural alignment nor label-space compatibility. The core innovation is a parameter-space bridging mechanism: a learnable low-rank parameter adapter extracts knowledge from the source model, including knowledge from its training trajectory, and adaptively maps it into the target model's parameter space. By combining low-rank parameter querying with end-to-end differentiable joint training, the framework supports cross-modal, cross-task, and cross-architecture transfer. Experiments on several highly heterogeneous transfer benchmarks show substantial improvements, and the approach remains effective in settings where representative approaches falter or do not apply.
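To make the "low-rank parameter" view concrete, the sketch below factorizes a single source-layer weight matrix into rank-r components with a truncated SVD. This is an illustrative assumption, not the paper's actual procedure: the function name `low_rank_factors`, the choice of rank 8, and the use of SVD are placeholders for whatever low-rank parameterization MergeNet employs.

```python
import torch

def low_rank_factors(weight: torch.Tensor, rank: int = 8):
    """Return rank-r factors (U_r, V_r) such that U_r @ V_r approximates `weight`.

    Illustrative only: MergeNet operates on low-rank views of model parameters;
    the exact factorization it uses may differ from a truncated SVD.
    """
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # (out_dim, rank), singular values folded in
    V_r = Vh[:rank, :]             # (rank, in_dim)
    return U_r, V_r

# Example: compress a source layer's weights into a small "knowledge" tensor.
source_weight = torch.randn(512, 768)
U_r, V_r = low_rank_factors(source_weight, rank=8)
approx = U_r @ V_r                 # low-rank reconstruction of the source weights
```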

📝 Abstract
In this study, we focus on heterogeneous knowledge transfer across entirely different model architectures, tasks, and modalities. Existing knowledge transfer methods (e.g., backbone sharing, knowledge distillation) often hinge on shared elements within model structures or task-specific features/labels, limiting transfers to complex model types or tasks. To overcome these challenges, we present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models, facilitating the direct interaction, extraction, and application of knowledge within these parameter spaces. The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters and adeptly learning to identify and map parameters into the target model. MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage, including the training trajectory knowledge of the source model. Extensive experiments on heterogeneous knowledge transfer demonstrate significant improvements in challenging settings, where representative approaches may falter or prove less applicable.
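The abstract describes a parameter adapter that queries the source model's low-rank parameters and learns to map them into the target model. Below is a minimal, hypothetical sketch of such an adapter, written as a cross-attention between low-rank parameter rows; the class name `ParameterAdapter`, the dimensions, and the residual update rule are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ParameterAdapter(nn.Module):
    """Sketch of an attention-style parameter adapter (hypothetical).

    Rows of the target model's low-rank factors act as queries over rows of
    the source model's low-rank factors; the attended result is folded back
    into the target factors as a residual update.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, target_lr: torch.Tensor, source_lr: torch.Tensor) -> torch.Tensor:
        # target_lr: (n_t, dim) low-rank rows of the target parameters
        # source_lr: (n_s, dim) low-rank rows of the source parameters
        scale = target_lr.shape[-1] ** 0.5
        attn = torch.softmax(self.q(target_lr) @ self.k(source_lr).T / scale, dim=-1)
        transferred = attn @ self.v(source_lr)   # source knowledge mapped into target space
        return target_lr + transferred           # residual update of the target factors

# Usage sketch: refresh the target's low-rank factor from the source's,
# then continue training the target model on its own task.
adapter = ParameterAdapter(dim=64)
target_factor = torch.randn(32, 64, requires_grad=True)
source_factor = torch.randn(48, 64)
new_target_factor = adapter(target_factor, source_factor)
```

Per the abstract, such an adapter is trained jointly with both models, so the mapped knowledge tracks the source model's training trajectory rather than a fixed parameter snapshot.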
Problem

Research questions and friction points this paper is trying to address.

Cross-model Transfer Learning
Task-agnostic Knowledge Transfer
Diverse Data Types
Innovation

Methods, ideas, or system contributions that make the work stand out.

MergeNet
Cross-model Knowledge Transfer
Adaptive Learning
👥 Authors
Kunxi Li (Zhejiang University)
Tianyu Zhan (Zhejiang University)
Shengyu Zhang (Zhejiang University)
Kun Kuang (Zhejiang University) · Causal Inference · Data Mining · Machine Learning
Jiwei Li (Zhejiang University)
Zhou Zhao (Zhejiang University) · Machine Learning · Data Mining · Multimedia Computing
Fei Wu (Zhejiang University)