Reward-Augmented Reinforcement Learning for Continuous Control in Precision Autonomous Parking via Policy Optimization Methods

📅 2025-07-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Autonomous parking (AP) faces challenges including nonlinear vehicle dynamics, environmental sensitivity, and stringent safety constraints, which lead to poor generalization and non-smooth control in conventional methods. To address these issues, this paper proposes the Reward-Augmented Reinforcement Learning for Autonomous Parking (RARLAP) framework. Its core innovation is the Milestone-Augmented Reward (MAR) mechanism, which integrates goal-directed rewards, dense proximity-based feedback, and phased milestone rewards to improve policy smoothness, convergence speed, and robustness. RARLAP is compatible with both on-policy and off-policy reinforcement learning algorithms. Evaluated in a high-fidelity Unity 3D simulation, the MAR-enhanced PPO agent achieves a 91% parking success rate, significantly outperforming the GOR and DPR baselines, while generating smoother trajectories, exhibiting improved training stability, and demonstrating stronger safety compliance and cross-scenario generalization.

📝 Abstract
Autonomous parking (AP) represents a critical yet complex subset of intelligent vehicle automation, characterized by tight spatial constraints, frequent close-range obstacle interactions, and stringent safety margins. However, conventional rule-based and model-predictive methods often lack the adaptability and generalization needed to handle the nonlinear and environment-dependent complexities of AP. To address these limitations, we propose a reward-augmented learning framework for AP (RARLAP) that mitigates the inherent complexities of continuous-domain control by leveraging structured reward design to induce smooth and adaptable policy behavior, trained entirely within a high-fidelity Unity-based custom 3D simulation environment. We systematically design and assess three structured reward strategies: goal-only reward (GOR), dense proximity reward (DPR), and milestone-augmented reward (MAR), each integrated with both on-policy and off-policy optimization paradigms. Empirical evaluations demonstrate that the on-policy MAR agent achieves a 91% success rate, yielding smoother trajectories and more robust behavior, while GOR and DPR fail to guide effective learning. Convergence and trajectory analyses demonstrate that the proposed framework enhances policy adaptability, accelerates training, and improves safety in continuous control. Overall, RARLAP establishes that reward augmentation effectively addresses complex autonomous parking challenges, enabling scalable and efficient policy optimization with both on- and off-policy methods. To support reproducibility, the code accompanying this paper is publicly available.
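The three reward strategies described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the distance tolerance, milestone thresholds, and bonus magnitudes are assumptions chosen for readability.

```python
# Hypothetical sketch of the GOR, DPR, and MAR reward strategies.
# All thresholds and weights below are illustrative assumptions.

GOAL_TOLERANCE = 0.5           # metres within which parking counts as success
MILESTONES = (8.0, 4.0, 2.0)   # distance thresholds, each paying a one-time bonus

def goal_only_reward(dist_to_goal, collided):
    """GOR: sparse signal paid only on terminal events."""
    if collided:
        return -1.0
    return 1.0 if dist_to_goal < GOAL_TOLERANCE else 0.0

def dense_proximity_reward(prev_dist, dist_to_goal, collided):
    """DPR: dense shaping proportional to progress toward the goal."""
    if collided:
        return -1.0
    return prev_dist - dist_to_goal  # positive when the agent moves closer

def milestone_augmented_reward(prev_dist, dist_to_goal, collided, reached):
    """MAR: dense proximity term + phased milestone bonuses + terminal goal reward.

    `reached` is a mutable set recording which milestones have already paid out,
    so each bonus is granted at most once per episode.
    """
    r = dense_proximity_reward(prev_dist, dist_to_goal, collided)
    if collided:
        return r
    for i, threshold in enumerate(MILESTONES):
        if dist_to_goal < threshold and i not in reached:
            reached.add(i)
            r += 0.5  # phased milestone bonus (assumed magnitude)
    if dist_to_goal < GOAL_TOLERANCE:
        r += 1.0  # terminal goal-directed reward
    return r
```

Under this sketch, GOR gives the agent no gradient until it happens to park, which matches the abstract's finding that GOR fails to guide learning; MAR layers one-time milestone bonuses on top of the dense progress term so intermediate phases of the maneuver are explicitly rewarded.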
Problem

Research questions and friction points this paper is trying to address.

Addresses complex autonomous parking with tight spatial constraints
Improves adaptability in continuous control via reward-augmented learning
Enhances policy optimization for smoother and safer parking trajectories
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reward-augmented learning framework for autonomous parking
Structured reward design for smooth policy behavior
High-fidelity Unity-based 3D simulation environment