🤖 AI Summary
To address plasticity loss (the decline in a network's ability to adapt to new tasks during continual learning), this paper proposes AID (Activation by Interval-wise Dropout), a novel dropout variant operating at the level of pre-activation intervals. AID assigns a different dropout probability to each pre-activation interval, requiring no additional parameters or task-specific tuning. Theoretically, the authors prove that AID regularizes networks toward behavior resembling deep linear networks, mechanistically mitigating plasticity loss. Empirically, AID improves average accuracy by 3.2–5.7% across continual learning benchmarks (CIFAR-10, CIFAR-100, TinyImageNet), and in reinforcement learning on the Arcade Learning Environment it significantly improves performance on 12 of 16 tasks.
📝 Abstract
Plasticity loss, a critical challenge in neural network training, limits a model's ability to adapt to new tasks or shifts in data distribution. This paper introduces AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout, designed to address plasticity loss. Unlike standard Dropout, AID generates subnetworks by applying dropout with a different probability to each pre-activation interval. Theoretical analysis reveals that AID regularizes the network, promoting behavior analogous to that of deep linear networks, which do not suffer from plasticity loss. We validate the effectiveness of AID in maintaining plasticity across various benchmarks, including continual learning tasks on standard image classification datasets such as CIFAR-10, CIFAR-100, and TinyImageNet. Furthermore, we show that AID enhances reinforcement learning performance on the Arcade Learning Environment benchmark.
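The core mechanism described above, dropout whose probability depends on which pre-activation interval a unit falls in, can be sketched in a few lines. The sketch below is a minimal illustration, not the paper's implementation: it assumes just two intervals split at zero, with hypothetical drop probabilities `p_neg` and `p_pos`, and uses standard inverted-dropout rescaling.

```python
import numpy as np

def interval_wise_dropout(preact, p_neg=0.5, p_pos=0.1, rng=None):
    """Illustrative sketch of interval-wise dropout.

    Each unit's drop probability depends on its pre-activation
    interval: here, two intervals split at zero, with drop
    probability p_neg for preact < 0 and p_pos for preact >= 0.
    The boundary and probabilities are assumptions for illustration,
    not the paper's exact parameterization.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = np.where(preact < 0, p_neg, p_pos)   # per-unit drop probability
    keep = rng.random(preact.shape) >= p     # Bernoulli keep mask
    scale = 1.0 / (1.0 - p)                  # inverted-dropout rescaling
    return preact * keep * scale

x = np.array([-1.0, -2.0, 3.0, 4.0])
# With both probabilities zero, the layer is the identity.
print(interval_wise_dropout(x, p_neg=0.0, p_pos=0.0))
```

Because different intervals are dropped at different rates, the sampled subnetworks behave more like (piecewise) linear maps on the surviving units, which is the intuition behind the deep-linear-network analogy in the abstract.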