🤖 AI Summary
To address catastrophic forgetting in cross-domain continual learning, this paper proposes a domain-aware adapter architecture tailored for Vision Transformers (ViTs). It introduces lightweight adapters into ViT self-attention layers and integrates domain-specific feature gating with dynamic multi-head output routing, explicitly modeling how task ordering modulates knowledge stability. Unlike conventional single-domain continual learning paradigms, the approach enables parameter-efficient and structurally controllable cross-domain knowledge retention. Evaluated on a heterogeneous sequential benchmark comprising CIFAR-100, Flowers102, and DTD, it achieves an average accuracy gain of over 8% compared to state-of-the-art parameter-efficient fine-tuning (PEFT) methods. The results demonstrate significantly mitigated forgetting and improved generalization, validating the effectiveness of co-designing task-ordering strategies with domain-aware architectural mechanisms.
📝 Abstract
Continual learning enables models to learn from a continuous stream of data while preserving previously acquired knowledge, with catastrophic forgetting as its central challenge. In this study, we propose a new approach that integrates adapters within the self-attention mechanisms of Vision Transformers to improve knowledge retention when datasets from different domains are added sequentially. Unlike previous methods that perform continual learning within a single dataset, our approach introduces domain-specific output heads and feature gating, allowing the model to maintain high accuracy on previously learned tasks while incorporating only the essential information from each new domain. We compare the proposed method against prominent state-of-the-art parameter-efficient fine-tuning (PEFT) methods, and the results show that it effectively alleviates the limitations of prior work. Furthermore, we conduct a comparative analysis on three datasets, CIFAR-100, Flowers102, and DTD, each representing a distinct domain, to investigate the impact of task order on model performance. Our findings underscore the critical role of dataset sequencing in shaping learning outcomes: strategic ordering significantly improves the model's ability to adapt to evolving data distributions over time while preserving previously learned knowledge.
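To make the architectural idea concrete, the following is a minimal NumPy sketch of the three components the abstract describes: a bottleneck adapter with a residual connection, a per-domain feature gate, and per-domain output heads. All dimensions, parameter names, and the use of a sigmoid gate are illustrative assumptions, not the paper's actual implementation; a real model would insert such adapters into frozen ViT self-attention layers and train only the adapter, gate, and head parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

D, R = 16, 4          # token dimension and adapter bottleneck (assumed sizes)
NUM_DOMAINS = 3       # e.g. CIFAR-100, Flowers102, DTD

# Hypothetical trainable parameters: one bottleneck adapter per layer,
# plus a feature gate and a classification head for each domain.
W_down = rng.standard_normal((D, R)) * 0.1
W_up   = rng.standard_normal((R, D)) * 0.1
gates  = {d: rng.standard_normal(D) for d in range(NUM_DOMAINS)}
heads  = {d: rng.standard_normal((D, 10)) * 0.1 for d in range(NUM_DOMAINS)}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adapter_block(x, domain):
    """Bottleneck adapter with residual connection, followed by a
    domain-specific sigmoid gate over feature channels."""
    h = np.maximum(x @ W_down, 0.0) @ W_up   # down-project, ReLU, up-project
    x = x + h                                # residual keeps the frozen path intact
    return x * sigmoid(gates[domain])        # gate selects domain-relevant features

def forward(tokens, domain):
    """Route pooled, gated features to the head of the requested domain."""
    feat = adapter_block(tokens, domain)
    return feat.mean(axis=0) @ heads[domain]

tokens = rng.standard_normal((8, D))         # 8 tokens from a frozen ViT layer
logits = forward(tokens, domain=1)
print(logits.shape)  # (10,)
```

Because only the adapter, gates, and heads are updated while the backbone stays frozen, each new domain adds a small parameter budget, and earlier domains keep their own gate and head untouched, which is the mechanism the abstract credits for reduced forgetting.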