Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses three challenges in continual learning with large language models: catastrophic forgetting, limited transferability, and vulnerability to adversarial perturbations. The authors propose AdvCL, which repurposes adversarial perturbations from a defensive tool into geometric control signals. AdvCL integrates three plug-and-play modules: Intra-Smooth enhances the local smoothness of the loss landscape, Proto-Clip prevents over-alignment to current-task prototypes, and Inter-Align actively aligns representations toward historical-task prototypes to reduce representational divergence. The modules form a composable geometric control mechanism compatible with diverse continual learning paradigms, including replay, regularization, and dynamic architectures. Experiments show that AdvCL consistently improves on existing baselines in both standard performance and robustness, with lower forgetting and stronger cross-task transfer.

📝 Abstract

In dynamic environments, large language models need to keep adapting to new tasks, but continual learning often suffers from forgetting, limited transfer, and vulnerability to adversarial perturbations. To address this, we present AdvCL, which repurposes adversarial perturbations as a geometric control signal for stable continual adaptation. AdvCL combines three plug-in modules: Intra-Smooth promotes local smoothness via small adversarial perturbations; Proto-Clip uses similarity clipping to prevent excessive alignment to the current-task prototype; and Inter-Align applies directional alignment toward previous-task prototypes to reduce representational gaps. Experiments show consistent gains in both standard performance and robustness, with lower forgetting and stronger transfer. We further analyze key mechanisms by quantifying the sensitivity of Intra-Smooth to perturbation settings and the effect of Inter-Align on task similarity and geometric distance. In summary, the modules provide complementary gains when combined, and each can also be integrated individually into diverse CL paradigms, including replay, regularization, and dynamic architectures, thereby offering a geometric control mechanism for continual learning.
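The abstract describes the three modules only in words, so the following is a minimal illustrative sketch of what each geometric operation could look like on a single representation vector. It is not the authors' implementation. Every function name, the threshold `tau`, the step size `eps`, the use of cosine similarity, and the FGSM-style sign step are assumptions introduced here for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity, with a small constant to avoid division by zero."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + 1e-12)


def intra_smooth_perturb(h, grad, eps=0.05):
    """Assumed Intra-Smooth step: a small FGSM-style perturbation of h.

    Moves each coordinate by eps in the direction of the loss gradient's sign.
    A smoothness penalty could then compare the loss at h and at this point.
    """
    return [x + eps * ((g > 0) - (g < 0)) for x, g in zip(h, grad)]


def proto_clip_loss(h, current_proto, tau=0.8):
    """Assumed Proto-Clip term: penalize only similarity above the clip tau.

    Below tau the term is zero, so h is free to approach the current-task
    prototype up to that level of similarity but is discouraged beyond it.
    """
    return max(0.0, cosine(h, current_proto) - tau)


def inter_align_step(h, previous_proto, eps=0.1):
    """Assumed Inter-Align move: shift h by eps along the unit direction
    pointing from h toward a previous-task prototype."""
    d = [p - x for p, x in zip(previous_proto, h)]
    norm = math.sqrt(dot(d, d)) + 1e-12
    return [x + eps * di / norm for x, di in zip(h, d)]
```

In this sketch, `proto_clip_loss` stays at zero until the similarity to the current-task prototype exceeds `tau`, while `inter_align_step` shrinks the Euclidean distance to the previous-task prototype by roughly `eps` per call. How the paper actually weights or combines these terms is not stated in the abstract.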

Problem

Research questions and friction points this paper is trying to address.

continual learning

catastrophic forgetting

adversarial perturbations

knowledge transfer

large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

adversarial perturbations

continual learning

geometric alignment