Continual Generalized Category Discovery: Learning and Forgetting from a Bayesian Perspective

📅 2025-07-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Continual Generalized Category Discovery (C-GCD) confronts the dual challenges of incrementally learning novel categories from unlabeled data streams while mitigating catastrophic forgetting of previously acquired knowledge. This work identifies covariance misalignment as the primary cause of forgetting and proposes VB-CGCD, a variational Bayesian framework for dynamic distribution alignment. VB-CGCD employs covariance-aware nearest-class-mean classification to generate robust pseudo-labels and integrates variational inference–driven class-distribution adaptation with noise suppression to resolve knowledge conflicts under mixed-category settings. Model parameters are optimized via a stochastic variational update mechanism. On standard benchmarks, VB-CGCD achieves a 15.21% improvement in final-session accuracy. Under an extremely low-label regime—where only 10% of new-class samples are annotated—it attains 67.86% accuracy, significantly surpassing the state-of-the-art (38.55%).
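The covariance-aware nearest-class-mean classifier mentioned above assigns each unlabeled sample to the class whose Gaussian (mean, covariance) is closest in Mahalanobis distance, rather than plain Euclidean distance to the mean. A minimal sketch of that idea (the function name and regularization constant are illustrative, not from the paper):

```python
import numpy as np

def mahalanobis_ncm_pseudo_labels(features, class_means, class_covs, eps=1e-6):
    """Pseudo-label each feature with the class whose Gaussian
    (mean, covariance) yields the smallest Mahalanobis distance."""
    n_classes = len(class_means)
    dists = np.empty((features.shape[0], n_classes))
    for k in range(n_classes):
        # Regularize the covariance so the inverse is well-conditioned.
        cov = class_covs[k] + eps * np.eye(class_covs[k].shape[0])
        inv_cov = np.linalg.inv(cov)
        diff = features - class_means[k]
        # (x - mu)^T Sigma^{-1} (x - mu) for every sample at once.
        dists[:, k] = np.einsum("ni,ij,nj->n", diff, inv_cov, diff)
    return dists.argmin(axis=1)
```

Because each class has its own covariance, an elongated class can "claim" samples that a Euclidean nearest-mean rule would misassign, which is what makes the pseudo-labels more robust under covariance misalignment.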

📝 Abstract
Continual Generalized Category Discovery (C-GCD) faces a critical challenge: incrementally learning new classes from unlabeled data streams while preserving knowledge of old classes. Existing methods struggle with catastrophic forgetting, especially when unlabeled data mixes known and novel categories. We address this by analyzing C-GCD's forgetting dynamics through a Bayesian lens, revealing that covariance misalignment between old and new classes drives performance degradation. Building on this insight, we propose Variational Bayes C-GCD (VB-CGCD), a novel framework that integrates variational inference with covariance-aware nearest-class-mean classification. VB-CGCD adaptively aligns class distributions while suppressing pseudo-label noise via stochastic variational updates. Experiments show that VB-CGCD surpasses prior art by +15.21% in overall accuracy in the final session on standard benchmarks. We also introduce a challenging new benchmark with only 10% labeled data and extended online phases, on which VB-CGCD achieves 67.86% final accuracy, significantly higher than the state-of-the-art (38.55%), demonstrating its robust applicability across diverse scenarios. Code is available at: https://github.com/daihao42/VB-CGCD
Problem

Research questions and friction points this paper is trying to address.

Incremental learning of new classes from unlabeled data streams
Preserving knowledge of old classes to avoid catastrophic forgetting
Handling mixed unlabeled data with known and novel categories
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian analysis of forgetting dynamics
Variational inference with covariance alignment
Stochastic updates suppress pseudo-label noise
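The Bayesian flavor of these contributions can be illustrated with a standard conjugate Gaussian update: as new (pseudo-labeled) samples arrive for a class, its mean estimate is pulled toward the new data in proportion to the evidence, while the prior from earlier sessions resists abrupt drift. This is a generic sketch of that mechanism under a known-observation-covariance assumption, not the paper's actual update rule:

```python
import numpy as np

def gaussian_mean_posterior(prior_mean, prior_cov, obs_cov, observations):
    """Conjugate Bayesian update of a Gaussian class mean.

    prior_mean/prior_cov: belief about the class mean from earlier sessions.
    obs_cov: assumed (known) covariance of individual observations.
    observations: new samples assigned to this class, shape (n, d).
    Returns the posterior mean and covariance over the class mean.
    """
    n = observations.shape[0]
    prior_prec = np.linalg.inv(prior_cov)
    obs_prec = np.linalg.inv(obs_cov)
    # Precisions add; the posterior mean is a precision-weighted average.
    post_cov = np.linalg.inv(prior_prec + n * obs_prec)
    post_mean = post_cov @ (prior_prec @ prior_mean
                            + n * obs_prec @ observations.mean(axis=0))
    return post_mean, post_cov
```

With few new samples the posterior stays near the old-session prior (mitigating forgetting); with many, it shifts toward the new evidence, which is the distribution-alignment behavior the bullets above describe.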
Hao Dai
Department of Computer Science, UCL Centre for Artificial Intelligence, University College London, London, UK
Jagmohan Chauhan
Assistant Professor, CS, UCL
Machine Learning Systems · Robotics · Trustworthy AI · Edge Computing · AI for Health