MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities

📅 2024-04-20
🏛️ arXiv.org
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This study addresses heterogeneous knowledge transfer, i.e., effective knowledge propagation across models with incompatible architectures, tasks, and data modalities. The authors propose a general-purpose transfer framework that requires neither structural alignment nor label-space compatibility. The core innovation is a parameter-space bridging mechanism: a learnable low-rank parameter adapter extracts knowledge from the source model, including knowledge from its training trajectory, and adaptively maps it into the target model's parameter space. By combining low-rank parameter querying with end-to-end differentiable joint training, the framework supports cross-modal, cross-task, and cross-architecture transfer. Experiments on several highly heterogeneous transfer benchmarks show substantial improvements, and the approach remains effective in settings where representative approaches falter or do not apply.
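To make the "low-rank parameter" view concrete, the sketch below factorizes a single source-layer weight matrix into rank-r components with a truncated SVD. This is an illustrative assumption, not the paper's actual procedure: the function name `low_rank_factors`, the choice of rank 8, and the use of SVD are placeholders for whatever low-rank parameterization MergeNet employs.

```python
import torch

def low_rank_factors(weight: torch.Tensor, rank: int = 8):
    """Return rank-r factors (U_r, V_r) such that U_r @ V_r approximates `weight`.

    Illustrative only: MergeNet operates on low-rank views of model parameters;
    the exact factorization it uses may differ from a truncated SVD.
    """
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # (out_dim, rank), singular values folded in
    V_r = Vh[:rank, :]             # (rank, in_dim)
    return U_r, V_r

# Example: compress a source layer's weights into a small "knowledge" tensor.
source_weight = torch.randn(512, 768)
U_r, V_r = low_rank_factors(source_weight, rank=8)
approx = U_r @ V_r                 # low-rank reconstruction of the source weights
```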

📝 Abstract
In this study, we focus on heterogeneous knowledge transfer across entirely different model architectures, tasks, and modalities. Existing knowledge transfer methods (e.g., backbone sharing, knowledge distillation) often hinge on shared elements within model structures or task-specific features/labels, limiting transfers to complex model types or tasks. To overcome these challenges, we present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models, facilitating the direct interaction, extraction, and application of knowledge within these parameter spaces. The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters and adeptly learning to identify and map parameters into the target model. MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage, including the training trajectory knowledge of the source model. Extensive experiments on heterogeneous knowledge transfer demonstrate significant improvements in challenging settings, where representative approaches may falter or prove less applicable.
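The abstract describes a parameter adapter that queries the source model's low-rank parameters and learns to map them into the target model. Below is a minimal, hypothetical sketch of such an adapter, written as a cross-attention between low-rank parameter rows; the class name `ParameterAdapter`, the dimensions, and the residual update rule are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ParameterAdapter(nn.Module):
    """Sketch of an attention-style parameter adapter (hypothetical).

    Rows of the target model's low-rank factors act as queries over rows of
    the source model's low-rank factors; the attended result is folded back
    into the target factors as a residual update.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, target_lr: torch.Tensor, source_lr: torch.Tensor) -> torch.Tensor:
        # target_lr: (n_t, dim) low-rank rows of the target parameters
        # source_lr: (n_s, dim) low-rank rows of the source parameters
        scale = target_lr.shape[-1] ** 0.5
        attn = torch.softmax(self.q(target_lr) @ self.k(source_lr).T / scale, dim=-1)
        transferred = attn @ self.v(source_lr)   # source knowledge mapped into target space
        return target_lr + transferred           # residual update of the target factors

# Usage sketch: refresh the target's low-rank factor from the source's,
# then continue training the target model on its own task.
adapter = ParameterAdapter(dim=64)
target_factor = torch.randn(32, 64, requires_grad=True)
source_factor = torch.randn(48, 64)
new_target_factor = adapter(target_factor, source_factor)
```

Per the abstract, such an adapter is trained jointly with both models, so the mapped knowledge tracks the source model's training trajectory rather than a fixed parameter snapshot.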
Problem

Research questions and friction points this paper is trying to address.

Cross-model Transfer Learning
Task-agnostic Knowledge Transfer
Diverse Data Types
Innovation

Methods, ideas, or system contributions that make the work stand out.

MergeNet
Cross-model Knowledge Transfer
Adaptive Learning
👥 Authors
Kunxi Li (Zhejiang University)
Tianyu Zhan (Zhejiang University)
Shengyu Zhang (Zhejiang University)
Kun Kuang (Zhejiang University) · Causal Inference · Data Mining · Machine Learning
Jiwei Li (Zhejiang University)
Zhou Zhao (Zhejiang University) · Machine Learning · Data Mining · Multimedia Computing
Fei Wu (Zhejiang University)