HOPE: A Reinforcement Learning-based Hybrid Policy Path Planner for Diverse Parking Scenarios

📅 2024-05-31
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing automated parking methods suffer from low success rates and poor generalization in complex real-world scenarios. To address this, we propose a hybrid path planner integrating Proximal Policy Optimization (PPO) with Reeds-Shepp geometric trajectory planning. Our approach features: (1) a novel collaborative architecture where Reeds-Shepp generates kinematically feasible trajectory priors to guide RL policy learning; (2) a parking difficulty classification criterion based on spatial constraints and obstacle distribution; (3) an environment-aware action masking mechanism that enhances training stability and trajectory feasibility; and (4) a Transformer encoder for fusing multi-source environmental features. Evaluated jointly in simulation and on real vehicles, our method achieves significantly higher success rates across diverse parking scenarios compared to rule-based and pure RL baselines, demonstrating strong cross-scenario generalization. The implementation is publicly available.

📝 Abstract
Automated parking stands as a highly anticipated application of autonomous driving technology. However, existing path planning methodologies fall short of addressing this need due to their inability to handle the diverse and complex parking scenarios encountered in reality. While non-learning methods provide reliable planning results, they are vulnerable to intricate cases, whereas learning-based methods are good at exploration but unstable in converging to feasible solutions. To leverage the strengths of both approaches, we introduce the Hybrid pOlicy Path plannEr (HOPE). This novel solution integrates a reinforcement learning agent with Reeds-Shepp curves, enabling effective planning across diverse scenarios. HOPE guides the exploration of the reinforcement learning agent through an action mask mechanism and employs a transformer to integrate the perceived environmental information with the mask. To facilitate the training and evaluation of the proposed planner, we propose a criterion for categorizing the difficulty level of parking scenarios based on space and obstacle distribution. Experimental results demonstrate that our approach outperforms typical rule-based algorithms and traditional reinforcement learning methods, achieving higher planning success rates and stronger generalization across various scenarios. We also conduct real-world experiments to verify the practicability of HOPE. The code for our solution is openly available at https://github.com/jiamiya/HOPE.
Problem

Research questions and friction points this paper is trying to address.

Existing path planners cannot handle the diverse, complex parking scenarios found in the real world
Rule-based methods plan reliably but are brittle in intricate cases
Learning-based methods explore well but converge unstably to feasible solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid policy integrating a reinforcement learning agent with Reeds-Shepp curves
Action mask mechanism guiding exploration toward feasible maneuvers
Transformer that fuses perceived environmental information with the action mask
Difficulty classification criterion for parking scenarios based on space and obstacle distribution
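To illustrate the action-masking idea in general terms, the sketch below masks out infeasible actions (e.g. ones that would cause a collision) before sampling from a discrete policy. This is a minimal illustration of the standard technique, not the paper's implementation; the logits, mask values, and action set are hypothetical.

```python
import numpy as np

def masked_action_distribution(logits, action_mask):
    """Apply an environment-derived action mask to policy logits.

    logits      : raw policy-network outputs, shape (n_actions,)
    action_mask : boolean array, True where the action is feasible
                  (e.g. collision-free and kinematically valid)
    Returns a probability distribution with infeasible actions at zero.
    """
    masked = np.where(action_mask, logits, -np.inf)  # suppress infeasible actions
    masked = masked - masked.max()                   # numerically stable softmax
    exp = np.exp(masked)
    return exp / exp.sum()

# Hypothetical example: 4 discrete maneuvers, action 2 blocked by an obstacle
probs = masked_action_distribution(
    np.array([0.5, 1.2, 3.0, -0.3]),
    np.array([True, True, False, True]),
)
```

Because masked actions receive exactly zero probability, the agent never samples them during training, which is what makes this kind of mechanism improve training stability and trajectory feasibility.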
Mingyang Jiang
Shanghai Jiao Tong University
robotics · intelligent vehicle · machine learning
Yueyuan Li
Department of Automation, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, CN
Songan Zhang
Global Institute of Future Technology, Shanghai Jiao Tong University
Autonomous Vehicle · Robotics · AI
Chunxiang Wang
Department of Automation, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, CN
Ming Yang
Department of Automation, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, CN