๐ค AI Summary
This work addresses the significant performance degradation of existing physics-informed goal-conditioned reinforcement learning methods in contact-rich manipulation tasks, where hybrid contact dynamics induce non-smooth value landscapes. To overcome this challenge, the authors propose a contact-aware hierarchical physics-informed reinforcement learning framework that reliably extends Pi-GCRL to such settings for the first time. The approach integrates optimal controlโinspired inductive biases, hybrid system modeling, and a hierarchical policy architecture to selectively impose physical priors during goal-conditioned value learning. This design effectively mitigates the adverse effects of discontinuous dynamics while preserving sample efficiency. Experimental results demonstrate that the proposed method substantially improves both generalization to arbitrary goals and task success rates in environments characterized by frequent and complex contacts.
๐ Abstract
Learning to reach arbitrary goals from sparse feedback requires agents to infer a rich notion of reachability across state--goal pairs. Goal-conditioned reinforcement learning (GCRL) tackles this challenge by learning policies that generalize across goals, but this generalization becomes increasingly difficult as the underlying dynamics become high-dimensional, hybrid, or contact-dependent. To address this issue, physics-informed GCRL (Pi-GCRL) introduces optimal-control-inspired inductive biases into goal-conditioned value learning. While Pi-GCRL methods have proven effective in navigation and object-free goal-reaching domains, their reliability in contact-rich tasks remains unclear, where contact interactions induce hybrid dynamics, mode-dependent controllability, and nonsmooth value landscapes. In this work, we show that these structural properties can cause existing Pi-GCRL methods to degrade when applied naively to contact-rich manipulation. Motivated by this analysis, we introduce contact-aware and hierarchical formulations that apply physics-informed inductive biases selectively across the manipulation problem. Our results provide a principled step toward extending Pi-GCRL to contact-rich manipulation.