🤖 AI Summary
This paper investigates a dilemma in deploying vision-language models (VLMs) for forceful, contact-rich robotic manipulation: VLM-controlled robots can produce physically harmful behavior, yet the behavioral safeguards that suppress such behavior also impair beneficial manipulation involving human body parts. Through two case studies of safeguarded VLMs that generate forceful robotic motion, we provide empirical evidence of a trade-off between value alignment and physical capability: safety mechanisms aimed at "zero-risk" operation inhibit harmful *and* helpful contact behaviors alike, challenging current safety evaluation paradigms. Consequently, we argue for an assessment framework that jointly weighs safety, functional utility, and ethical adaptability, and we discuss the implications of this trade-off for model evaluation and robot learning in VLM-driven embodied systems.
📝 Abstract
Humans learn how and when to apply forces in the world via a complex physiological and psychological learning process. Attempting to replicate this in vision-language models (VLMs) presents two challenges: VLMs can produce harmful behavior, which is particularly dangerous for VLM-controlled robots that interact with the world, but imposing behavioral safeguards can limit their functional and ethical scope. We conduct two case studies on safeguarding VLMs that generate forceful robotic motion, finding that safeguards reduce both harmful and helpful behavior involving contact-rich manipulation of human body parts. We then discuss the key implication of this result, that value alignment may impede desirable robot capabilities, for model evaluation and robot learning.
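
To make the trade-off concrete, below is a minimal sketch of a naive "zero-risk" guardrail that refuses any commanded motion contacting a human body part. All names here (`ForceCommand`, `zero_risk_guardrail`) and the example commands are hypothetical illustrations, not the paper's actual safeguard or experimental setup.

```python
from dataclasses import dataclass

@dataclass
class ForceCommand:
    """A simplified forceful-motion command a VLM planner might emit."""
    description: str      # natural-language intent produced by the VLM
    target: str           # what the end effector contacts
    force_newtons: float  # commanded contact force magnitude
    contacts_human: bool  # whether the motion touches a human body part

def zero_risk_guardrail(cmd: ForceCommand) -> bool:
    """Naive 'zero-risk' safeguard: refuse any motion that applies
    force to a human body part, regardless of intent or magnitude."""
    return not cmd.contacts_human

commands = [
    ForceCommand("strike the person's arm", "human arm", 40.0, True),
    ForceCommand("wipe debris from the person's arm", "human arm", 5.0, True),
    ForceCommand("press the elevator button", "button", 2.0, False),
]

for cmd in commands:
    verdict = "EXECUTE" if zero_risk_guardrail(cmd) else "REFUSE"
    print(f"{verdict}: {cmd.description}")
# The guardrail refuses both the harmful strike and the beneficial
# assistive wipe: the safety/utility trade-off the paper reports.
```

A filter like this satisfies a zero-risk criterion, but, as the case studies suggest, it forecloses assistive contact along with harmful contact, which is why the authors call for evaluation that scores utility alongside safety.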