🤖 AI Summary
Embodied AI faces unique physical safety risks -- such as lifting an overloaded object or handing a hot beverage to a child -- that demand commonsense physical understanding and proactive intervention, yet current foundation models lack explicit physical safety awareness and mitigation capabilities. To address this, we introduce the first multimodal physical safety benchmark grounded in real-world injury incidents, using high-fidelity image and video generation to simulate "safe → hazardous" state transitions. We propose an embodied safety-constrained reasoning framework that integrates system-level instructions to elicit interpretable, step-by-step safety reasoning traces. Furthermore, we leverage large multimodal models to jointly model risk detection, causal inference, and intervention decision-making. Experiments demonstrate significant improvements in constraint satisfaction rate and reliability under physical safety requirements. Our work establishes a verifiable, interpretable paradigm for evaluating and enhancing the safety of embodied AI systems.
📝 Abstract
When AI interacts with the physical world -- as a robot or an assistive agent -- new safety challenges emerge beyond those of purely "digital AI". In such interactions, the potential for physical harm is direct and immediate. How well do state-of-the-art foundation models understand commonsense facts about physical safety, e.g., that a box may be too heavy to lift, or that a hot cup of coffee should not be handed to a child? In this paper, our contributions are three-fold. First, we develop a highly scalable approach to continuous physical safety benchmarking of Embodied AI systems, grounded in real-world injury narratives and operational safety constraints. To probe multimodal safety understanding, we turn these narratives and constraints into photorealistic images and videos capturing transitions from safe to unsafe states, using advanced generative models. Second, we comprehensively analyze the ability of major foundation models to perceive risks, reason about safety, and trigger interventions; this yields multi-faceted insights into their deployment readiness for safety-critical agentic applications. Finally, we develop a post-training paradigm to teach models to explicitly reason about embodiment-specific safety constraints provided through system instructions. The resulting models generate thinking traces that make safety reasoning interpretable and transparent, achieving state-of-the-art performance in constraint satisfaction evaluations. The benchmark will be released at https://asimov-benchmark.github.io/v2