Set Pivot Learning: Redefining Generalized Segmentation with Vision Foundation Models

📅 2025-08-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional domain generalization (DG) assumes test domains are entirely unseen during training—a premise increasingly unrealistic with the emergence of vision foundation models (VFMs). This work redefines the DG paradigm for the VFM era by proposing **Set Pivot Learning (SPL)**, a task-driven, dynamic class-aware prompting framework. SPL jointly designs a dynamic prompt tuner and a VFM-centric feature refinement module to enable continual domain alignment under generalized segmentation. Crucially, SPL introduces *set-level semantic pivots*—dynamic, category-aware anchor representations derived from class-wise feature sets—into VFM adaptation, overcoming the limitations of static feature transfer. Evaluated on multiple DG segmentation benchmarks, SPL consistently outperforms state-of-the-art methods, demonstrating superior cross-domain generalization and robustness, especially under large domain shifts and diverse unseen domains.
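To make the idea of class-wise anchor representations concrete, here is a minimal sketch under stated assumptions. The summary above does not say how SPL actually computes its set-level pivots, so this uses a plain per-class mean over a set of feature vectors as a stand-in. The function names `class_pivots` and `nearest_pivot`, the toy features, and the class labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def class_pivots(features, labels):
    """Average the feature vectors of each class into one anchor per class.

    Assumption: a simple mean stands in for whatever aggregation the
    paper actually uses to build its pivots.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, cls in zip(features, labels):
        if cls not in sums:
            sums[cls] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[cls][i] += value
        counts[cls] += 1
    return {cls: [s / counts[cls] for s in total] for cls, total in sums.items()}


def nearest_pivot(vec, pivots):
    """Return the class whose pivot is closest to vec (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(pivots, key=lambda cls: sq_dist(vec, pivots[cls]))


# Toy 2-D features for two hypothetical segmentation classes.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
labels = ["road", "road", "sky", "sky"]

pivots = class_pivots(features, labels)
predicted = nearest_pivot([1.5, 0.5], pivots)
```

Recomputing the pivots whenever the feature set changes is one simple way such anchors could stay dynamic rather than fixed, though the paper's own mechanism may differ.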

📝 Abstract
In this paper, we introduce, for the first time, the concept of Set Pivot Learning, a paradigm shift that redefines domain generalization (DG) based on Vision Foundation Models (VFMs). Traditional DG assumes that the target domain is inaccessible during training, but the emergence of VFMs, which are trained on vast and diverse datasets, renders this assumption unclear and obsolete. To address this challenge, we propose Set Pivot Learning (SPL), a new definition of the domain migration task based on VFMs that is better suited to current research and application requirements. Unlike conventional DG methods, SPL prioritizes adaptive refinement over rigid domain transfer, ensuring continuous alignment with evolving real-world conditions. Specifically, SPL features two key attributes: (i) dynamic adaptation, which moves from static domain alignment to flexible, task-driven feature optimization so that models can evolve with downstream scenarios; and (ii) VFM-centric tuning, which leverages pretrained knowledge as a pivot to hone task-specific representations while preserving cross-domain robustness. Building on SPL, we propose a Dynamic Prompt Fine-Tuning method, which combines a Dynamic Class-aware Prompter with a Prompt-guided Feature Focuser, to elevate VFM performance in targeted scenarios. Extensive experiments on benchmark datasets show the effectiveness of our method, highlighting its superiority over state-of-the-art methods, particularly in generalized segmentation.
Problem

Research questions and friction points this paper is trying to address.

Redefining domain generalization using Vision Foundation Models
Proposing Set Pivot Learning for adaptive refinement
Enhancing VFM performance with dynamic prompt fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Set Pivot Learning redefines domain generalization
Dynamic adaptation for task-driven feature optimization
VFM-centric tuning preserves cross-domain robustness
Xinhui Li
College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
Xinyu He
East China Normal University
Qiming Hu
PPPL
tokamak
Xiaojie Guo
IBM TJ Watson Research Center
deep graph learning, data mining