🤖 AI Summary
Embodiment-agnostic policies that act only through end-effector commands ignore the constraints of the deployed robot body, such as whole-body collision avoidance, which makes deployment brittle. This work proposes EmbodiSteer, a training-free, inference-time guidance framework that keeps visuomotor policy learning in Cartesian space while lifting diffusion sampling into the target robot's joint space via forward kinematics and Jacobian-based updates. After each denoising step, whole-body collision-aware guidance is applied to the joint trajectory, steering the arm away from collisions while preserving the learned end-effector behavior. The approach supports zero-shot, embodiment-aware deployment without retraining. Compared with Cartesian-only execution, it reduces collision rate by 46.1% and improves task success rate by 28.5% across nine simulated robots; on two physical robots in highly constrained scenarios, collision rate falls by 90.0% and success rate rises by 36.7%.
📝 Abstract
Scalable robot imitation learning relies on large-scale heterogeneous data from diverse robots or body-free data, making Cartesian end-effector actions a key interface for embodiment-agnostic policy learning. However, end-effector-only abstraction leaves Cartesian policies unaware of the deployed robot body, making them brittle under robot-specific constraints such as whole-body collision avoidance. To overcome this limitation, we present EmbodiSteer, a training-free framework that steers embodiment-agnostic visuomotor policies toward zero-shot, embodiment-aware deployment. EmbodiSteer keeps policy learning in Cartesian space while efficiently lifting inference-time diffusion sampling into the target robot's joint space via forward kinematics and Jacobian-based updates. With whole-body collision-aware guidance over joint trajectories after each denoising step, the arm can be steered away from collisions while preserving learned end-effector behavior. Compared with Cartesian-only execution, EmbodiSteer reduces collision rate by 46.1% and improves task success rate by 28.5% across 9 simulated robots, and further achieves 90.0% collision rate reduction and 36.7% success rate increase on two physical robots in highly constrained scenarios. Our project page is at https://frankwang67.github.io/EmbodiSteer-Page.
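To make the lifting idea concrete, the following is a minimal toy sketch, not EmbodiSteer's implementation. It assumes a hypothetical planar arm with three revolute joints, a single circular obstacle, and one end-effector waypoint standing in for a denoised Cartesian sample; the diffusion model and full trajectories are omitted. All function names, the cost, and the step sizes are illustrative assumptions. Each update tracks the Cartesian target with a damped Jacobian pseudo-inverse and pushes the body away from the obstacle in the Jacobian's null space, so the end-effector pose is preserved to first order.

```python
import math

LINKS = (1.0, 1.0, 1.0)  # hypothetical planar arm: three revolute joints, unit links

def fk(q):
    """Forward kinematics: position of each link tip; the last is the end effector."""
    x = y = angle = 0.0
    tips = []
    for qi, li in zip(q, LINKS):
        angle += qi
        x, y = x + li * math.cos(angle), y + li * math.sin(angle)
        tips.append((x, y))
    return tips

def jacobian(q):
    """2x3 Jacobian of the end-effector position w.r.t. the joint angles."""
    tips = fk(q)
    origins = [(0.0, 0.0)] + tips[:-1]  # location of each joint axis
    ex, ey = tips[-1]
    return [[-(ey - oy) for _, oy in origins],
            [ex - ox for ox, _ in origins]]

def collision_cost(q, center, radius):
    """Sum of squared penetration depths of the link tips into a circular obstacle."""
    cost = 0.0
    for px, py in fk(q):
        depth = radius - math.hypot(px - center[0], py - center[1])
        cost += max(0.0, depth) ** 2
    return cost

def cost_gradient(q, center, radius, eps=1e-6):
    """Central finite-difference gradient of the collision cost (an assumption
    made for brevity; an analytic gradient would work equally well)."""
    grad = []
    for k in range(len(q)):
        hi, lo = list(q), list(q)
        hi[k] += eps
        lo[k] -= eps
        diff = collision_cost(hi, center, radius) - collision_cost(lo, center, radius)
        grad.append(diff / (2 * eps))
    return grad

def guided_step(q, ee_target, center, radius, step=1.0, damping=1e-3):
    """One guidance update in joint space for a single Cartesian waypoint."""
    n = len(q)
    J = jacobian(q)
    # Damped pseudo-inverse J^T (J J^T + damping I)^-1 via a closed-form 2x2 inverse.
    a = sum(v * v for v in J[0]) + damping
    b = sum(u * v for u, v in zip(J[0], J[1]))
    d = sum(v * v for v in J[1]) + damping
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    pinv = [[J[0][k] * inv[0][c] + J[1][k] * inv[1][c] for c in range(2)]
            for k in range(n)]
    ex, ey = fk(q)[-1]
    err = (ee_target[0] - ex, ee_target[1] - ey)
    push = [-step * g for g in cost_gradient(q, center, radius)]
    new_q = []
    for k in range(n):
        # Task term: pull the end effector back onto the Cartesian target.
        task = pinv[k][0] * err[0] + pinv[k][1] * err[1]
        # Null-space term: (I - pinv J) applied to the collision push.
        null = sum(((k == m) - (pinv[k][0] * J[0][m] + pinv[k][1] * J[1][m])) * push[m]
                   for m in range(n))
        new_q.append(q[k] + task + null)
    return new_q

# Usage: the second link tip starts inside the obstacle; the target is the current pose.
q = [math.pi / 2, -math.pi / 2, -math.pi / 2]
target = fk(q)[-1]
obstacle, radius = (1.3, 1.2), 0.5
cost_before = collision_cost(q, obstacle, radius)
for _ in range(50):
    q = guided_step(q, target, obstacle, radius)
cost_after = collision_cost(q, obstacle, radius)
ee = fk(q)[-1]
ee_error = math.hypot(ee[0] - target[0], ee[1] - target[1])
```

In this toy setting the arm has one redundant degree of freedom, which is what lets the body move away from the obstacle while the end effector stays on target. A real deployment would need the robot's actual kinematic model, a proper whole-body collision representation, and guidance over full joint trajectories rather than a single waypoint.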