AI Summary
This work addresses a gap in human-robot interaction: existing safety filters either neglect the robot's ability to learn online or lack formally verifiable safety guarantees. The authors propose a method for certifying belief-space neural safety filters that, for the first time, integrates conformal prediction into this framework. By accounting for runtime Bayesian inference during verification, the method provides high-probability safety assurances while significantly reducing conservatism. It explicitly models the reliability of the inference module and concentrates verification on the region of the belief space where inference is expected to be reliable. In human-vehicle interaction simulations, the approach certifies a substantially more permissive safety filter than a standard conformal prediction baseline while retaining high-probability safety guarantees.
Abstract
Autonomous robots that interact with people must make safe and efficient decisions under human-induced uncertainty, such as uncertainty about a person's preferences, goals, competency, and willingness to cooperate. Safety filters are a popular approach for ensuring safety in interactive robotics, since their modular design separates safety from performance, allowing robots to operate safely around people with minimal impact on task efficiency. Traditional safety filters typically operate only in the physical space, neglecting the robot's ability to learn and adapt online. The recently proposed belief-space safety filter (BeliefSF) instead reasons about robot safety in closed loop with runtime inference that actively reduces the robot's uncertainty online, thereby reducing conservativeness in filtering. However, providing formal safety guarantees for robots deploying BeliefSF remains a significant challenge, both because runtime inference can be wrong and because the high dimensionality of belief spaces requires neural approximation of the safety filter, which introduces its own errors. In this paper, we propose an algorithmic approach to certify high-probability safety of BeliefSF using conformal prediction, while explicitly accounting for the reliability of the robot's runtime inference module. Our method leverages the structure of belief-space safety filtering by focusing verification on a region where inference is expected to be reliable. It preserves the simplicity and sample complexity of standard conformal prediction, yet can certify a substantially less conservative safety filter. Through a simulated human-vehicle interaction benchmark, we show that our approach verifies a significantly more permissive belief-space safety filter than a standard conformal prediction baseline.
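To make the role of conformal prediction concrete, the following is a minimal sketch of the standard split conformal quantile, not the authors' algorithm. It assumes each calibration sample carries a scalar nonconformity score (here imagined as the approximation error of a learned safety filter) and a hypothetical flag saying whether the sample lies in a region where inference is considered reliable. The function name, the toy scores, and the reliability flag are all illustrative assumptions.

```python
import math


def conformal_quantile(scores, alpha):
    """Split conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest score.

    If calibration and test scores are exchangeable, a fresh test score is at
    most this threshold with probability at least 1 - alpha. Returns +inf when
    there are too few calibration samples for the requested level.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


# Toy calibration data (purely illustrative, not from the paper):
# small errors where inference is assumed reliable, large errors elsewhere.
reliable_scores = [i / 1000 for i in range(100)]        # 0.000 .. 0.099
unreliable_scores = [0.5 + i / 100 for i in range(50)]  # 0.50 .. 0.99

alpha = 0.1

# Baseline: calibrate on every sample, reliable or not.
q_all = conformal_quantile(reliable_scores + unreliable_scores, alpha)

# Restricted: calibrate only on samples from the assumed-reliable region.
q_reliable = conformal_quantile(reliable_scores, alpha)

print(f"threshold over all samples:      {q_all:.3f}")
print(f"threshold over reliable samples: {q_reliable:.3f}")
```

In this toy setup the restricted threshold is much smaller than the baseline threshold, which is the intuition behind a less conservative certificate. The restricted guarantee only covers test points drawn from the same reliable region, so the probability of leaving that region has to be accounted for separately, which is the part the abstract attributes to modeling the reliability of the inference module.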