🤖 AI Summary
Existing automated parking methods suffer from low success rates and poor generalization in complex real-world scenarios. To address this, we propose a hybrid path planner integrating Proximal Policy Optimization (PPO) with Reeds-Shepp geometric trajectory planning. Our approach features: (1) a novel collaborative architecture where Reeds-Shepp generates kinematically feasible trajectory priors to guide RL policy learning; (2) a parking difficulty classification criterion based on spatial constraints and obstacle distribution; (3) an environment-aware action masking mechanism that enhances training stability and trajectory feasibility; and (4) a Transformer encoder for fusing multi-source environmental features. Evaluated both in simulation and on real vehicles, our method achieves significantly higher success rates across diverse parking scenarios than rule-based and pure RL baselines, demonstrating strong cross-scenario generalization. The implementation is publicly available.
📝 Abstract
Automated parking stands as a highly anticipated application of autonomous driving technology. However, existing path planning methodologies fall short of addressing this need due to their inability to handle the diverse and complex parking scenarios encountered in reality. While non-learning methods provide reliable planning results, they struggle in intricate scenarios, whereas learning-based methods excel at exploration but are unstable in converging to feasible solutions. To leverage the strengths of both approaches, we introduce the Hybrid pOlicy Path plannEr (HOPE). This novel solution integrates a reinforcement learning agent with Reeds-Shepp curves, enabling effective planning across diverse scenarios. HOPE guides the exploration of the reinforcement learning agent through an action mask mechanism and employs a transformer to integrate the perceived environmental information with the mask. To facilitate the training and evaluation of the proposed planner, we propose a criterion for categorizing the difficulty level of parking scenarios based on space and obstacle distribution. Experimental results demonstrate that our approach outperforms typical rule-based algorithms and traditional reinforcement learning methods, achieving higher planning success rates and better generalization across various scenarios. We also conduct real-world experiments to verify the practicability of HOPE. The code for our solution is openly available at https://github.com/jiamiya/HOPE.
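The action-mask idea described above can be illustrated with a minimal sketch: infeasible (e.g. collision-prone) actions in a discretized action set have their policy logits suppressed before sampling, so the agent only explores feasible maneuvers. This is a generic illustration of action masking, not the paper's implementation; the function name and discrete action assumption are ours.

```python
import numpy as np

def masked_action_distribution(logits, feasible):
    """Turn raw policy logits into a probability distribution over a
    discrete action set, assigning zero probability to actions marked
    infeasible by the environment-aware mask (illustrative sketch)."""
    # Suppress infeasible actions by setting their logits to -inf
    masked = np.where(feasible, logits, -np.inf)
    # Numerically stable softmax over the remaining actions
    z = masked - masked.max()
    p = np.exp(z)
    return p / p.sum()
```

For example, with `logits = [1.0, 2.0, 3.0]` and the second action masked out, the resulting distribution puts zero mass on that action while renormalizing over the rest, which keeps sampled trajectories feasible during training.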