🤖 AI Summary
Federated continual learning (FCL) confronts two intertwined challenges: catastrophic forgetting, which is exacerbated by data heterogeneity, communication constraints, and privacy requirements; and difficult inference when task identities are unknown. To address these, we propose a replay-free gradient projection framework. On each client, model updates are projected onto the orthogonal complement of the subspace spanned by representations of previous tasks, which mitigates cross-task interference and preserves prior knowledge. We also introduce a lightweight task identity prediction module, driven by core bases from prior tasks, that enables adaptive inference without task identifiers. The approach is designed to keep communication overhead low and to preserve privacy. Evaluated on multiple standard FCL benchmarks, our method outperforms existing state-of-the-art approaches in average accuracy, particularly in the challenging setting where task identity is unknown.
📝 Abstract
Federated continual learning (FCL) enables distributed client devices to learn from streaming data across diverse and evolving tasks. Catastrophic forgetting, a major challenge in continual learning, is exacerbated in decentralized settings by data heterogeneity, constrained communication, and privacy concerns. We propose Federated gradient Projection-based Continual Learning with Task Identity Prediction (FedProTIP), a novel FCL framework that mitigates forgetting by projecting client updates onto the orthogonal complement of the subspace spanned by previously learned representations of the global model. This projection reduces interference with earlier tasks and preserves performance across the task sequence. To further address the challenge of task-agnostic inference, we incorporate a lightweight mechanism that leverages core bases from prior tasks to predict task identity and dynamically adjust the global model's outputs. Extensive experiments across standard FCL benchmarks demonstrate that FedProTIP significantly outperforms state-of-the-art methods in average accuracy, particularly in settings where task identities are not known a priori.
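
The two ideas in the abstract can be illustrated with a minimal NumPy sketch for a single linear layer. This is not the FedProTIP implementation: the function names (`orthonormal_basis`, `project_out`, `predict_task`), the `energy` threshold used to pick the basis size, and the scoring rule for task identity are all assumptions made for illustration, since the abstract does not specify how core bases are selected or how they are used for prediction.

```python
import numpy as np


def orthonormal_basis(representations, energy=0.95):
    """Orthonormal basis for the dominant subspace of past-task inputs.

    representations: array of shape (d_in, n_samples).
    energy: fraction of squared singular-value mass to keep
            (hypothetical selection rule, not taken from the paper).
    """
    u, s, _ = np.linalg.svd(representations, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    return u[:, :k]


def project_out(grad, basis):
    """Project a weight gradient onto the orthogonal complement of span(basis).

    grad:  array of shape (d_out, d_in) for a layer computing W @ x.
    basis: array of shape (d_in, k) with orthonormal columns.
    After projection, grad @ x is zero for every x in span(basis), so an
    update along it leaves the layer's outputs on those inputs unchanged.
    """
    return grad - (grad @ basis) @ basis.T


def predict_task(x, bases):
    """Guess the task whose basis captures the most energy of x.

    Hypothetical scoring rule: the norm of x's coordinates in each basis.
    """
    scores = [np.linalg.norm(b.T @ x) for b in bases]
    return int(np.argmax(scores))
```

In this sketch a client would call `project_out` on each layer's gradient before its local optimizer step, using a basis built from representations of earlier tasks; `predict_task` shows one plausible way stored bases could be reused at inference time when no task label is available.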