Push Smarter, Not Harder: Hierarchical RL-Diffusion Policy for Efficient Nonprehensile Manipulation

📅 2025-12-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Nonprehensile object manipulation in cluttered environments is challenging due to complex contact dynamics and the difficulty of long-horizon planning. This paper proposes a hierarchical control architecture: a high-level reinforcement learning (PPO) agent generates semantic intermediate goals, while a low-level goal-conditioned diffusion model synthesizes physically feasible, efficient trajectories in real time. To the authors' knowledge, this is the first work to synergistically integrate reinforcement learning with conditional diffusion models for nonprehensile manipulation, enabling goal-directed, decoupled control. Evaluated in a 2D physics simulator, the method achieves a 92.3% success rate, reduces path length by 37% compared to state-of-the-art methods, and demonstrates strong generalization and scalability across diverse obstacle configurations. These results substantially improve the practicality and robustness of nonprehensile manipulation systems.

📝 Abstract
Nonprehensile manipulation, such as pushing objects across cluttered environments, presents a challenging control problem due to complex contact dynamics and long-horizon planning requirements. In this work, we propose HeRD, a hierarchical reinforcement learning-diffusion policy that decomposes pushing tasks into two levels: high-level goal selection and low-level trajectory generation. We employ a high-level reinforcement learning (RL) agent to select intermediate spatial goals, and a low-level goal-conditioned diffusion model to generate feasible, efficient trajectories to reach them. This architecture combines the long-term reward-maximizing behaviour of RL with the generative capabilities of diffusion models. We evaluate our method in a 2D simulation environment and show that it outperforms the state-of-the-art baseline in success rate, path efficiency, and generalization across multiple environment configurations. Our results suggest that hierarchical control with generative low-level planning is a promising direction for scalable, goal-directed nonprehensile manipulation. Code, documentation, and trained models are available: https://github.com/carosteven/HeRD.
Problem

Research questions and friction points this paper is trying to address.

Complex contact dynamics make nonprehensile pushing hard to control
Long-horizon planning is difficult in cluttered environments
Existing methods struggle with success rate, path efficiency, and generalization across environment configurations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical RL-diffusion policy for manipulation
High-level RL selects intermediate spatial goals
Low-level diffusion model generates efficient trajectories
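The two-level loop described above can be sketched in a few lines. Note this is a hypothetical illustration, not the HeRD implementation: the class names, the 2D goal representation, and both policy bodies are assumptions. The trained PPO agent is replaced by a bounded step toward the final goal, and the diffusion model by simple interpolation, to show only how goal selection and trajectory generation are decoupled.

```python
import numpy as np

class HighLevelGoalSelector:
    """Stand-in for the high-level PPO agent: picks an intermediate
    2D subgoal. A trained policy would condition on obstacles/clutter."""

    def __init__(self, step_size=0.5):
        self.step_size = step_size

    def select_goal(self, obj_pos, final_goal):
        # Take a bounded step toward the final goal.
        direction = final_goal - obj_pos
        dist = np.linalg.norm(direction)
        if dist <= self.step_size:
            return final_goal.copy()
        return obj_pos + self.step_size * direction / dist

class GoalConditionedTrajectoryModel:
    """Stand-in for the low-level diffusion model: emits a short
    waypoint trajectory toward the subgoal. A trained model would
    denoise a trajectory conditioned on (obj_pos, subgoal)."""

    def __init__(self, horizon=8):
        self.horizon = horizon

    def generate(self, obj_pos, subgoal):
        alphas = np.linspace(0.0, 1.0, self.horizon)[:, None]
        return obj_pos + alphas * (subgoal - obj_pos)

def push_to_goal(obj_pos, final_goal, max_subgoals=20, tol=1e-3):
    """Hierarchical loop: select a subgoal, roll out a trajectory,
    repeat until the object reaches the final goal."""
    high = HighLevelGoalSelector()
    low = GoalConditionedTrajectoryModel()
    obj_pos = np.asarray(obj_pos, dtype=float)
    final_goal = np.asarray(final_goal, dtype=float)
    for _ in range(max_subgoals):
        subgoal = high.select_goal(obj_pos, final_goal)
        traj = low.generate(obj_pos, subgoal)
        obj_pos = traj[-1]  # assume the trajectory is tracked perfectly
        if np.linalg.norm(obj_pos - final_goal) < tol:
            return True, obj_pos
    return False, obj_pos
```

The decoupling is the key point: the high-level selector reasons only about where the object should go next, while the low-level generator handles how to get there, so each component can be trained and swapped independently.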