🤖 AI Summary
To address catastrophic forgetting in task-incremental reinforcement learning, where agents controlled by large models degrade on earlier tasks as new tasks arrive, this paper proposes the Progressive Prompt-based Decision Transformer (P2DT). Methodologically, P2DT combines a Transformer architecture with prompt engineering and trajectory distillation, enabling cross-task knowledge consolidation without fine-tuning the backbone network. Its key contributions are: (1) a dynamic, expandable decision-token mechanism that supports continual policy evolution while keeping the backbone parameters frozen; and (2) the joint use of offline RL trajectories and newly generated task-specific prompts for effective knowledge retention. Evaluated on multi-task RL benchmarks, P2DT achieves an average performance improvement of 37% over baselines while scaling well as the number of tasks increases. The approach substantially mitigates forgetting without sacrificing architectural efficiency or requiring full model retraining.
📝 Abstract
Catastrophic forgetting poses a substantial challenge for intelligent agents controlled by a large model, causing performance degradation when these agents face new tasks. In our work, we propose a novel solution: the Progressive Prompt Decision Transformer (P2DT). This method enhances a transformer-based model by dynamically appending decision tokens during new task training, thus fostering task-specific policies. Our approach mitigates forgetting in continual and offline reinforcement learning scenarios. Moreover, P2DT leverages trajectories collected via traditional reinforcement learning from all tasks and generates new task-specific tokens during training, thereby retaining knowledge from previous tasks. Preliminary results demonstrate that our model effectively alleviates catastrophic forgetting and scales well with increasing task environments.
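To make the mechanism concrete, the core idea of appending learnable decision tokens per task to a frozen transformer backbone can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name `PromptPool`, the dimensions, and the use of a generic `nn.TransformerEncoder` as a stand-in for the Decision Transformer are all assumptions.

```python
import torch
import torch.nn as nn

class PromptPool(nn.Module):
    """Hypothetical sketch of P2DT-style task prompts (names/sizes assumed).

    A shared transformer backbone is frozen; each new task only adds a small
    set of learnable prompt (decision) tokens that are prepended to the
    embedded trajectory sequence, so old tasks' parameters are never touched.
    """

    def __init__(self, d_model: int = 64, prompt_len: int = 4):
        super().__init__()
        self.d_model = d_model
        self.prompt_len = prompt_len
        self.prompts = nn.ParameterDict()  # task id -> learnable prompt tokens
        # Frozen shared backbone (stand-in for the Decision Transformer).
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.backbone.parameters():
            p.requires_grad = False  # backbone is never fine-tuned

    def add_task(self, task_id: str) -> None:
        # Expand the model: new tokens for the new task, old ones untouched.
        self.prompts[task_id] = nn.Parameter(
            torch.randn(self.prompt_len, self.d_model) * 0.02)

    def forward(self, task_id: str, traj_tokens: torch.Tensor) -> torch.Tensor:
        # traj_tokens: (batch, seq_len, d_model), embedded trajectory tokens.
        batch = traj_tokens.size(0)
        prompt = self.prompts[task_id].unsqueeze(0).expand(batch, -1, -1)
        # Prepend task prompt, then run the frozen backbone.
        return self.backbone(torch.cat([prompt, traj_tokens], dim=1))

model = PromptPool()
model.add_task("walker")
out = model("walker", torch.randn(2, 10, 64))  # shape: (2, 14, 64)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

After `add_task`, the only trainable parameters are the new task's prompt tokens (`trainable == ["prompts.walker"]`), which is what allows continual policy growth without disturbing knowledge stored in the backbone or in earlier tasks' prompts.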