Dynamics-Aligned Latent Imagination in Contextual World Models for Zero-Shot Generalization

📅 2025-08-27
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
Problem: Real-world reinforcement learning demands zero-shot generalization to unseen environments (e.g., unknown friction or gravity), yet existing contextual Markov decision process (cMDP) methods often rely on explicit, measurable context variables, which limits them when the context is latent or hard to measure. Method: We propose DALI (Dynamics-Aligned Latent Imagination), presented as the first framework to integrate dynamics alignment and implicit context modeling within the Dreamer architecture. DALI trains a self-supervised encoder to predict forward dynamics, so that the latent context is inferred from agent-environment interactions and yields causally consistent representations that support plausible counterfactuals. It couples a latent world model with cross-context consistency constraints to enable context-aware planning and policy learning. Contribution/Results: On multiple cMDP benchmarks, DALI significantly outperforms context-unaware baselines and often surpasses context-aware baselines on extrapolation tasks, indicating zero-shot transfer to novel dynamics without access to explicit context variables.

📝 Abstract
Real-world reinforcement learning demands adaptation to unseen environmental conditions without costly retraining. Contextual Markov Decision Processes (cMDPs) model this challenge, but existing methods often require explicit context variables (e.g., friction, gravity), limiting their use when contexts are latent or hard to measure. We introduce Dynamics-Aligned Latent Imagination (DALI), a framework integrated within the Dreamer architecture that infers latent context representations from agent-environment interactions. By training a self-supervised encoder to predict forward dynamics, DALI generates actionable representations that condition the world model and policy, bridging perception and control. We theoretically prove this encoder is essential for efficient context inference and robust generalization. DALI's latent space enables counterfactual consistency: perturbing a gravity-encoding dimension alters imagined rollouts in physically plausible ways. On challenging cMDP benchmarks, DALI achieves significant gains over context-unaware baselines, often surpassing context-aware baselines in extrapolation tasks, enabling zero-shot generalization to unseen contextual variations.
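
The idea of inferring a latent context by predicting forward dynamics can be illustrated with a toy sketch. This is not the paper's implementation: the 1-D point-mass dynamics, the step size `DT`, the linear encoder, and all function names below are assumptions made for illustration. The encoder never sees the gravity value; it is fitted only so that its output `z` explains the error of a context-free forward prediction, and it is then applied to a gravity outside the training range.

```python
import numpy as np

DT = 0.1  # integration step (assumed for this toy example)

def sample_transitions(gravity, n, rng):
    """Sample n independent transitions (v, a, v_next) of a 1-D point mass."""
    v = rng.normal(size=n)
    a = rng.normal(size=n)
    v_next = v + DT * (a - gravity)  # gravity is the hidden context
    return v, a, v_next

def window_features(v, a, v_next):
    """Summarize a window of transitions; the context itself is never an input."""
    return np.array([v.mean(), a.mean(), v_next.mean(), 1.0])

def fit_encoder(train_gravities, rng, window=8):
    """Fit linear encoder weights from forward-prediction residuals only."""
    feats, residuals = [], []
    for g in train_gravities:
        v, a, v_next = sample_transitions(g, window + 1, rng)
        feats.append(window_features(v[:-1], a[:-1], v_next[:-1]))
        # Self-supervised target: what the context-free model v + DT*a
        # misses on a held-out query transition from the same context.
        residuals.append(v_next[-1] - (v[-1] + DT * a[-1]))
    w, *_ = np.linalg.lstsq(np.array(feats), np.array(residuals), rcond=None)
    return w

def encode(w, v, a, v_next):
    """Map a window of transitions to a scalar latent context z."""
    return float(window_features(v, a, v_next) @ w)

def predict_next(v, a, z):
    """Forward model conditioned on the latent context z."""
    return v + DT * a + z

rng = np.random.default_rng(0)
w = fit_encoder(rng.uniform(5.0, 12.0, size=200), rng)  # training contexts

# Zero-shot: a gravity of 20.0 lies outside the training range [5, 12].
v, a, v_next = sample_transitions(20.0, 8, rng)
z_unseen = encode(w, v, a, v_next)
implied_gravity = -z_unseen / DT
error = abs(predict_next(v[0], a[0], z_unseen) - v_next[0])
```

In this toy setting the dynamics are linear, so a linear encoder extrapolates exactly; a learned neural encoder on real benchmarks would offer no such guarantee.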

Problem

Research questions and friction points this paper is trying to address.

Adapting to unseen environmental conditions without retraining
Inferring latent context representations from agent interactions
Achieving zero-shot generalization in contextual MDPs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent context inference from interactions
Self-supervised encoder predicting forward dynamics
Counterfactual-consistent latent imagination space
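
Counterfactual consistency of an imagination space can be sketched as follows, again as a toy and not the paper's world model. The assumptions here are a scalar latent `z` that acts as a per-step velocity offset (so `z = -DT * g` for gravity `g`), a 1-D falling mass with zero action, and the hypothetical helper `imagine_heights`. Perturbing the latent toward stronger gravity should make every imagined height lower, and weakening it should do the opposite.

```python
import numpy as np

DT = 0.1  # integration step (assumed for this toy example)

def imagine_heights(z, horizon=10, h0=10.0, v0=0.0):
    """Roll out imagined heights with v' = v + z and h' = h + DT * v'."""
    h, v, heights = h0, v0, []
    for _ in range(horizon):
        v = v + z        # the latent context shifts the velocity each step
        h = h + DT * v   # height follows the updated velocity
        heights.append(h)
    return np.array(heights)

z_earth = -DT * 9.8                          # latent implied by g = 9.8
baseline = imagine_heights(z_earth)
stronger = imagine_heights(2.0 * z_earth)    # counterfactual: doubled gravity
weaker = imagine_heights(0.5 * z_earth)      # counterfactual: halved gravity

# Physically plausible ordering: stronger gravity falls faster at every step.
ordering_holds = bool(np.all(stronger < baseline) and np.all(baseline < weaker))
```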