Safe Planning and Policy Optimization via World Model Learning

📅 2025-06-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In safety-critical reinforcement learning, inaccurate world models can lead to catastrophic failures. Method: The paper proposes a model-based framework that jointly optimizes task performance and safety through (1) an adaptive mechanism that switches between model-based planning and direct policy execution to limit the impact of world model errors; (2) an implicit world model that addresses the objective mismatch of traditional model-based approaches; and (3) dynamic safety thresholds that adapt to the agent's evolving capabilities, replacing static constraints. Contribution/Results: The framework combines dynamic mode switching with implicit modeling so that safety and performance are optimized together, rather than safety being treated only as a minimum requirement to meet. Evaluated on diverse safety-critical continuous-control tasks, the approach shows significant improvements over non-adaptive methods in both task performance and safety.

📝 Abstract

Reinforcement Learning (RL) applications in real-world scenarios must prioritize safety and reliability, which impose strict constraints on agent behavior. Model-based RL leverages predictive world models for action planning and policy optimization, but inherent model inaccuracies can lead to catastrophic failures in safety-critical settings. We propose a novel model-based RL framework that jointly optimizes task performance and safety. To address world model errors, our method incorporates an adaptive mechanism that dynamically switches between model-based planning and direct policy execution. We resolve the objective mismatch problem of traditional model-based approaches using an implicit world model. Furthermore, our framework employs dynamic safety thresholds that adapt to the agent's evolving capabilities, consistently selecting actions that surpass safe policy suggestions in both performance and safety. Experiments demonstrate significant improvements over non-adaptive methods, showing that our approach optimizes safety and performance simultaneously rather than merely meeting minimum safety requirements. The proposed framework achieves robust performance on diverse safety-critical continuous control tasks, outperforming existing methods.
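The abstract does not say how the switch between model-based planning and direct policy execution is decided. As a minimal sketch of the general idea only, the example below assumes a hypothetical per-state estimate of world model error and a fixed tolerance: the planner is trusted while the estimate stays within tolerance, and the policy acts directly otherwise. The names `select_action`, `model_error_estimate`, and `error_tolerance` are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, Sequence, Tuple

State = Sequence[float]


def select_action(
    state: State,
    plan_action: Callable[[State], float],
    policy_action: Callable[[State], float],
    model_error_estimate: Callable[[State], float],
    error_tolerance: float = 0.1,
) -> Tuple[float, str]:
    """Choose between the planner and the policy for one step.

    Assumption (not from the paper): the switch is driven by a scalar
    estimate of world model error compared against a fixed tolerance.
    Returns the chosen action and a label naming its source.
    """
    if model_error_estimate(state) <= error_tolerance:
        # The model is judged reliable here, so plan with it.
        return plan_action(state), "planner"
    # The model is judged unreliable here, so execute the policy directly.
    return policy_action(state), "policy"


# Hypothetical stand-ins for a planner, a policy, and an error estimator.
def toy_planner(state: State) -> float:
    return 1.0


def toy_policy(state: State) -> float:
    return -1.0


def toy_error(state: State) -> float:
    # Pretend the model gets worse as the first state component grows.
    return abs(state[0])


near = select_action([0.05], toy_planner, toy_policy, toy_error)
far = select_action([0.50], toy_planner, toy_policy, toy_error)
```

With these toy stand-ins, `near` comes from the planner and `far` comes from the policy. A real system would need a calibrated error signal, which this sketch does not attempt to provide.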

Problem

Research questions and friction points this paper is trying to address.

Ensuring RL agent safety and reliability in real-world scenarios

Handling world model inaccuracies in model-based RL planning

Optimizing task performance and safety simultaneously

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive switching between model-based planning and direct policy execution

Implicit world model for objective alignment

Dynamic safety thresholds for evolving capabilities
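The abstract describes dynamic safety thresholds as selecting actions that surpass the safe policy's suggestion in both performance and safety. One way to read that, sketched below under stated assumptions, is to use the policy suggestion itself as a moving bar: a planned candidate is accepted only if its estimated value is at least as high and its estimated cost at least as low. The `Candidate` type, the `pick_action` function, and the scalar value and cost estimates are hypothetical and are not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    action: float
    value: float  # estimated task return, higher is better
    cost: float   # estimated safety cost, lower is better


def pick_action(candidates: Iterable[Candidate], suggestion: Candidate) -> Candidate:
    """Return the best candidate that is no worse than the policy suggestion.

    Assumption (not from the paper): the threshold is the policy
    suggestion's own value and cost, so the bar moves as the policy improves.
    """
    acceptable = [
        c for c in candidates
        if c.value >= suggestion.value and c.cost <= suggestion.cost
    ]
    if not acceptable:
        # No candidate beats the suggestion on both axes, so keep it.
        return suggestion
    return max(acceptable, key=lambda c: c.value)


# Hypothetical estimates for one decision step.
suggestion = Candidate(action=0.0, value=1.0, cost=0.2)
candidates = [
    Candidate(action=0.3, value=1.5, cost=0.1),  # better on both axes
    Candidate(action=0.9, value=2.0, cost=0.5),  # higher value, but less safe
    Candidate(action=-0.2, value=0.8, cost=0.0),  # safer, but lower value
]
chosen = pick_action(candidates, suggestion)
```

Here `chosen` is the candidate with action 0.3, the only one that matches or beats the suggestion on both value and cost. Because the bar is the suggestion rather than a fixed constant, it tightens automatically as the safe policy gets better.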

🔎 Similar Papers

Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning