PRISM: Synergizing Vision Foundation Models via Self-organized Expert Specialization

📅 2026-06-02

🤖 AI Summary
This work addresses how to integrate the complementary capabilities of multiple vision foundation models (VFMs) into a single efficient model while mitigating negative transfer and feature interference. The authors propose PRISM, a dual-stream mixture-of-experts (MoE) framework with a two-stage paradigm: first, a teacher-conditional router guides experts to specialize in distinct representational subspaces; then the router learns to recombine these experts into sparse, task-specific computational pathways for downstream tasks. Combining teacher-conditional routing with dynamic routing is intended to reduce interference in multi-model fusion. The authors report that PRISM sets a new state of the art on PASCAL-Context and NYUD-v2, which they take as evidence that sparse, emergent specialization is a scalable way to integrate diverse visual knowledge.
📝 Abstract
Unifying the complementary strengths of diverse Vision Foundation Models (VFMs) into a single efficient model is highly desirable but challenged by the negative transfer inherent in monolithic distillation. To address these feature conflicts, we introduce PRISM, a novel dual-stream Mixture-of-Experts (MoE) framework that synergizes VFMs via modular specialization. We propose a two-stage paradigm: (1) expertise deconstruction, where a teacher-conditional router guides experts to specialize in distinct representational subspaces to mitigate interference, followed by (2) dynamic recomposition, where the router learns to assemble these experts into tailored computational pathways for downstream tasks. Experiments on PASCAL-Context and NYUD-v2 show that PRISM establishes a new state of the art, validating that sparse, emergent specialization is a scalable approach for integrating diverse visual knowledge.
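The abstract names the two routing stages but gives no architectural detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes top-k sparse gating, linear experts, a fixed per-teacher logit table standing in for the teacher-conditional router, and an untrained linear map standing in for the dynamic router. All names (`top_k_gates`, `teacher_logits`, `stage1_forward`, `stage2_forward`) and all numeric values are hypothetical, and the dual-stream structure and training losses are omitted.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gates(logits, k):
    """Sparse gating: softmax over the k largest logits, zero elsewhere."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    gates = [0.0] * len(logits)
    for i, p in zip(top, probs):
        gates[i] = p
    return gates


def make_linear(out_dim, in_dim, rng):
    """Hypothetical stand-in for an expert or router: a random linear map."""
    w = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def mix(experts, gates, x):
    """Weighted sum of expert outputs; experts with a zero gate are skipped."""
    out = [0.0] * len(x)
    for expert, g in zip(experts, gates):
        if g > 0.0:
            out = [o + g * y for o, y in zip(out, expert(x))]
    return out


rng = random.Random(0)
DIM, NUM_EXPERTS, K = 4, 6, 2
experts = [make_linear(DIM, DIM, rng) for _ in range(NUM_EXPERTS)]

# Stage 1 (assumed form of "expertise deconstruction"): routing depends only
# on which teacher is being distilled, so each teacher's signal reaches a
# small subset of experts. These logits are placeholders, not learned values.
teacher_logits = {
    "teacher_a": [2.0, 1.0, -5.0, -5.0, -5.0, -5.0],
    "teacher_b": [-5.0, -5.0, 2.0, 1.0, -5.0, -5.0],
    "teacher_c": [-5.0, -5.0, -5.0, -5.0, 2.0, 1.0],
}


def stage1_forward(x, teacher):
    gates = top_k_gates(teacher_logits[teacher], K)
    return mix(experts, gates, x), gates


# Stage 2 (assumed form of "dynamic recomposition"): routing depends on the
# input features, so the same experts can be reassembled per input or task.
dynamic_router = make_linear(NUM_EXPERTS, DIM, rng)


def stage2_forward(x):
    gates = top_k_gates(dynamic_router(x), K)
    return mix(experts, gates, x), gates


x = [0.5, -1.0, 0.25, 2.0]
y1, g1 = stage1_forward(x, "teacher_b")
y2, g2 = stage2_forward(x)
print("stage 1 active experts:", [i for i, g in enumerate(g1) if g > 0.0])
print("stage 2 active experts:", [i for i, g in enumerate(g2) if g > 0.0])
```

In this sketch the same expert pool is shared across both stages and only the source of the routing logits changes, which is one plausible reading of "deconstruction followed by recomposition". In a trained system both the teacher-conditional and the dynamic router would be learned rather than fixed.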
Problem

Research questions and friction points this paper is trying to address.

Vision Foundation Models
negative transfer
feature conflicts
model unification
Mixture-of-Experts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts
Vision Foundation Models
Expert Specialization
Negative Transfer Mitigation
Dynamic Routing