🤖 AI Summary
This work addresses the lack of systematic investigation into weight imprinting for downstream adaptation of foundation models. We propose the first unified framework comprising three stages: proxy generation, feature normalization, and weighted aggregation. Theoretically, we establish a novel connection between weight imprinting and neural collapse. Methodologically, we introduce a clustering-driven multi-proxy generation strategy to enhance robustness in novel-class recognition. We further demonstrate that feature normalization is critical for both training stability and generalization. Evaluated on challenging novel-class recognition tasks under complex data distributions, our approach achieves up to a 4% absolute accuracy improvement over prior methods. It significantly strengthens zero-shot and few-shot adaptation capabilities to unseen classes, enabling more reliable and scalable deployment of foundation models in open-world settings.
📝 Abstract
The capacity of a foundation model allows it to be adapted to new downstream tasks. Weight imprinting is a universal and efficient method to fulfill this purpose. It has been reinvented several times but has not been systematically studied. In this paper, we propose a framework for imprinting, identifying three main components: generation, normalization, and aggregation. This allows us to conduct an in-depth analysis of imprinting and a comparison of the existing work. We reveal the benefits of representing novel data with multiple proxies in the generation step and show the importance of proper normalization. We determine those proxies through clustering and propose a novel variant of imprinting that outperforms previous work. We motivate this by the neural collapse phenomenon, an important connection that we draw for the first time. Our results show an accuracy increase of up to 4% in challenging scenarios with complex data distributions for new classes.
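The three-stage pipeline described above (proxy generation via clustering, feature normalization, and similarity-based aggregation over multiple proxies) can be sketched in a few lines. This is a minimal illustrative implementation, not the authors' exact method: the k-means routine, the choice of L2 normalization, and the max-cosine-similarity aggregation are assumptions for demonstration, and all function names are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Project features onto the unit hypersphere (the normalization stage).
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def kmeans(feats, k, iters=20, seed=0):
    # Tiny k-means used to derive multiple proxies per class (assumption:
    # the paper's clustering step is illustrated here with plain k-means).
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None] - centers[None], axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = feats[assign == j]
            if len(members):  # keep old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def imprint_proxies(class_feats, k):
    # Generation stage: normalize, cluster, then re-normalize the
    # centroids so each proxy is a valid imprinted weight vector.
    return l2_normalize(kmeans(l2_normalize(class_feats), k))

def classify(query, proxies_per_class):
    # Aggregation stage: score each class by the maximum cosine
    # similarity between the query and any of that class's proxies.
    q = l2_normalize(query)
    scores = [np.max(P @ q) for P in proxies_per_class]
    return int(np.argmax(scores))
```

With multi-modal class distributions (the "complex data distributions" mentioned above), a single mean proxy can sit between modes, while multiple clustered proxies cover each mode separately:

```python
rng = np.random.default_rng(1)
c0 = rng.normal([2.0, 0.0], 0.1, size=(20, 2))  # toy features, class 0
c1 = rng.normal([0.0, 2.0], 0.1, size=(20, 2))  # toy features, class 1
proxies = [imprint_proxies(c0, k=2), imprint_proxies(c1, k=2)]
print(classify(np.array([1.0, 0.1]), proxies))  # expected: class 0
```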