Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks

📅 2025-02-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Meta-reinforcement learning (meta-RL) suffers from poor out-of-distribution (OOD) generalization, primarily due to task representations being sensitive to distributional shifts. To address this, we propose a task-aware virtual training framework comprising three key components: (1) a transferable task embedding space built via metric learning; (2) a task-aware virtual task generation mechanism that explicitly enforces representational consistency between training and OOD tasks; and (3) state regularization to mitigate value overestimation under dynamic state distributions. This is the first work to jointly integrate task-aware virtual task construction with state regularization in meta-RL. Evaluated on MuJoCo and MetaWorld benchmarks, our framework achieves an average 23.6% improvement in OOD task performance and a 41% increase in generalization stability, significantly outperforming existing meta-RL methods.
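As a rough illustration of components (1) and (2), the sketch below shows one way a metric-learning objective and a virtual-task sampler could look. It is not the paper's implementation. The function names, the externally supplied `task_distances` matrix, and the Dirichlet mixing scheme are all assumptions made for the example; the summary does not specify how distances are defined or how virtual tasks are constructed.

```python
import math
import random


def pairwise_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def metric_embedding_loss(latents, task_distances):
    """Mean squared gap between latent distances and given task distances.

    latents: list of task latent vectors (hypothetical encoder outputs).
    task_distances: square matrix where task_distances[i][j] is an assumed,
        externally supplied distance between task i and task j.
    """
    n = len(latents)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            gap = pairwise_distance(latents[i], latents[j]) - task_distances[i][j]
            total += gap ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


def sample_virtual_latent(latents, rng, concentration=1.0):
    """Mix training-task latents with Dirichlet weights (an assumed scheme)."""
    # A Dirichlet sample can be drawn as normalized Gamma variates.
    raw = [rng.gammavariate(concentration, 1.0) for _ in latents]
    norm = sum(raw)
    weights = [r / norm for r in raw]
    dim = len(latents[0])
    virtual = [sum(w * z[k] for w, z in zip(weights, latents)) for k in range(dim)]
    return virtual, weights


# Toy data: three training tasks whose latents already match the distances.
train_latents = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
task_distances = [[0, 5, 10], [5, 0, 5], [10, 5, 0]]

loss = metric_embedding_loss(train_latents, task_distances)
virtual_latent, mix_weights = sample_virtual_latent(train_latents, random.Random(0))
```

On this toy data the loss is zero because the latent geometry already reproduces the task distances. The sampled virtual latent is a convex combination of the training latents, so it stays inside the region they span. Reaching genuinely OOD tasks would need something beyond this, which the sketch does not attempt.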

📝 Abstract

Meta reinforcement learning aims to develop policies that generalize to unseen tasks sampled from a task distribution. While context-based meta-RL methods improve task representation using task latents, they often struggle with out-of-distribution (OOD) tasks. To address this, we propose Task-Aware Virtual Training (TAVT), a novel algorithm that accurately captures task characteristics for both training and OOD scenarios using metric-based representation learning. Our method successfully preserves task characteristics in virtual tasks and employs a state regularization technique to mitigate overestimation errors in state-varying environments. Numerical results demonstrate that TAVT significantly enhances generalization to OOD tasks across various MuJoCo and MetaWorld environments.

Problem

Research questions and friction points this paper is trying to address.

Meta-RL policies generalize poorly to unseen tasks

Context-based task latents struggle on out-of-distribution (OOD) tasks

Task characteristics must be captured accurately for both training and OOD tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Task-Aware Virtual Training

Metric-based representation learning

State regularization technique
