PACT: Self-Evolving Physical Safety Alignment for Diffusion Policies in Embodied Manipulation

📅 2026-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion-based policies for robotic manipulation often struggle to satisfy physical safety constraints while also achieving high task performance. Existing approaches either impose constraints too early in training, which limits policy expressiveness, or rely on external safeguards at deployment, which hinders scalability. This work proposes PACT, a self-evolving post-training framework that projects pretrained diffusion policies onto constraint-feasible regions without requiring demonstration data or task-specific rewards. PACT combines a reverse-KL objective, distillation of constraint gradients into the diffusion model, and a curriculum that progressively tightens constraints, with a theoretically bounded policy shift and monotone improvement. On simulated and real-world manipulation tasks, PACT reduces safety violations by 31.0% on average and improves task success by 30.7%.
📝 Abstract
Diffusion policies have achieved remarkable success in robotic manipulation, yet they often fail to satisfy the strict physical constraints required for safe deployment. Existing approaches impose safety either prematurely during training or reactively via external guardrails at test time, limiting policy expressivity and overall scalability. We propose Physical safety Alignment for Constrained Trajectories (PACT), a self-evolving post-training framework that projects pretrained diffusion policies onto constraint-feasible regions without accessing demonstration data or task rewards. PACT distills constraint gradients into the diffusion model through a reverse-KL objective with dense supervision across timesteps. It incorporates a curriculum that progressively tightens constraints while maintaining a theoretically bounded policy shift and monotone improvement, which mitigates the safety-performance trade-off caused by catastrophic forgetting. On simulated and real-world embodied manipulation benchmarks, PACT reduces safety violations by 31.0% on average while improving task success by 30.7%.
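To make the idea of a reverse-KL projection with a tightening curriculum concrete, here is a minimal toy sketch. It is not the paper's algorithm: it replaces the diffusion policy with a discrete distribution over four hypothetical actions, and it uses the standard closed-form result that the minimiser of KL(q || p) + lam * E_q[cost] is q proportional to p * exp(-lam * cost). The function names, the cost values, and the penalty schedule are all assumptions made for illustration.

```python
import math


def kl_tilt(p, cost, lam):
    """Closed-form minimiser of KL(q || p) + lam * E_q[cost] over discrete q."""
    weights = [pi * math.exp(-lam * ci) for pi, ci in zip(p, cost)]
    z = sum(weights)
    return [w / z for w in weights]


def reverse_kl(q, p):
    """KL(q || p): how far the updated policy q moved from the previous policy p."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def expected_cost(q, cost):
    """Expected constraint-violation cost under policy q."""
    return sum(qi * ci for qi, ci in zip(q, cost))


def run_curriculum(prior, cost, lambdas):
    """Apply one KL-regularised projection per stage, starting from the prior.

    Each stage projects the *current* policy, so the per-stage shift is
    measured against the previous stage rather than the original prior.
    Returns a list of (policy, expected_cost, per_stage_kl) tuples.
    """
    policy = list(prior)
    history = []
    for lam in lambdas:
        new_policy = kl_tilt(policy, cost, lam)
        shift = reverse_kl(new_policy, policy)
        policy = new_policy
        history.append((policy, expected_cost(policy, cost), shift))
    return history


if __name__ == "__main__":
    # Hypothetical toy setup: action probabilities from a "pretrained" policy
    # and a per-action violation cost (0.0 means the action is feasible).
    prior = [0.4, 0.3, 0.2, 0.1]
    cost = [1.0, 0.0, 0.5, 0.0]
    print(f"initial expected violation: {expected_cost(prior, cost):.3f}")
    for stage, (_, c, shift) in enumerate(run_curriculum(prior, cost, [0.5, 1.0, 2.0])):
        print(f"stage {stage}: expected violation {c:.3f}, KL shift {shift:.3f}")
```

In this toy setting the expected violation decreases at every stage, because exponential tilting with a positive penalty cannot increase the expected cost, and the KL term quantifies how much the policy moved per stage. This only loosely mirrors the bounded-shift and monotone-improvement properties the abstract claims. The actual PACT objective, its constraint-gradient distillation, and its guarantees are defined in the paper itself.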
Problem

Research questions and friction points this paper is trying to address.

diffusion policies
physical safety
embodied manipulation
constraint satisfaction
safe deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion policies
physical safety alignment
constraint distillation
self-evolving framework
curriculum learning