Learning Robot Safety from Sparse Human Feedback using Conformal Prediction

📅 2025-01-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the susceptibility of robot safety specifications to subjectivity, omitted edge cases, and poor generalization from safe training data, this paper proposes a framework for learning safety boundaries from sparse binary human feedback ("safe"/"unsafe") on policy trajectories. The method models the safety-critical region, potentially in a learned latent space, evaluates visuomotor policies, and augments model predictive control (MPC) with the learned safety constraints. Its core contribution is an adaptation of conformal prediction to sparse human feedback for estimating unsafe sets in state or latent space; it builds on nearest-neighbor classification and avoids the data withholding that conformal prediction typically requires, yielding a sample-efficient online safety monitor and a route to policy refinement. Evaluated across 30 quadrotor flights spanning six navigation tasks, the framework measurably improves MPC safety. Video-labeling experiments confirm early warning of gate-passing failures, with the miss rate bounded by the conformal prediction guarantee.

📝 Abstract
Ensuring robot safety can be challenging; user-defined constraints can miss edge cases, policies can become unsafe even when trained from safe data, and safety can be subjective. Thus, we learn about robot safety by showing policy trajectories to a human who flags unsafe behavior. From this binary feedback, we use the statistical method of conformal prediction to identify a region of states, potentially in learned latent space, guaranteed to contain a user-specified fraction of future policy errors. Our method is sample-efficient, as it builds on nearest neighbor classification and avoids withholding data as is common with conformal prediction. By alerting if the robot reaches the suspected unsafe region, we obtain a warning system that mimics the human's safety preferences with guaranteed miss rate. From video labeling, our system can detect when a quadcopter visuomotor policy will fail to steer through a designated gate. We present an approach for policy improvement by avoiding the suspected unsafe region. With it we improve a model predictive controller's safety, as shown in experimental testing with 30 quadcopter flights across 6 navigation tasks. Code and videos are provided.
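The warning system described in the abstract — a nearest-neighbor conformal prediction region calibrated on human-flagged unsafe states — can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the function names, the leave-one-out score (one way to avoid withholding calibration data), and the Euclidean metric are assumptions; the paper may use a different nonconformity score or a learned latent metric.

```python
import numpy as np

def calibrate_threshold(unsafe_states, alpha):
    """Leave-one-out nearest-neighbor conformal calibration.

    Each unsafe example's score is its distance to the nearest *other*
    unsafe example. The conformal quantile of these scores gives a radius
    such that, under exchangeability, a new unsafe state lands within that
    radius of some calibration point with probability >= 1 - alpha.
    (If ceil((n+1)(1-alpha)) > n, full coverage needs an infinite radius;
    this sketch clips to the largest observed score.)
    """
    X = np.asarray(unsafe_states, dtype=float)
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # exclude self-distance (leave-one-out)
    scores = d.min(axis=1)                   # nearest-neighbor distances
    k = int(np.ceil((n + 1) * (1 - alpha)))  # conformal rank
    return np.sort(scores)[min(k, n) - 1]

def is_suspect(state, unsafe_states, radius):
    """Alert if the state lies within the calibrated radius of any
    human-flagged unsafe state -- the suspected unsafe region."""
    X = np.asarray(unsafe_states, dtype=float)
    dist = float(np.min(np.linalg.norm(X - np.asarray(state, dtype=float), axis=1)))
    return dist <= radius
```

Monitoring then reduces to one nearest-neighbor query per state, which is what makes the approach sample-efficient and cheap to run online.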
Problem

Research questions and friction points this paper is trying to address.

Learning robot safety from human feedback
Using conformal prediction for error detection
Improving safety in quadcopter navigation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal prediction for safety
Nearest neighbor classification
Visuomotor policy improvement
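The policy-improvement idea — steering the controller away from the suspected unsafe region — could be realized, for instance, as a soft penalty added to an MPC stage cost. This is a hedged sketch only: the hinge form, the `weight` parameter, and the function names are illustrative, and the paper's actual constraint formulation may differ.

```python
import numpy as np

def unsafe_penalty(trajectory, unsafe_states, radius, weight=100.0):
    """Hinge penalty that grows as planned states enter a ball of the
    given radius around any flagged unsafe state (e.g. a radius obtained
    from conformal calibration). Adding this to the MPC cost biases the
    optimizer toward trajectories that keep a margin from the region.
    """
    X = np.asarray(unsafe_states, dtype=float)
    total = 0.0
    for s in np.asarray(trajectory, dtype=float):
        d = float(np.min(np.linalg.norm(X - s, axis=1)))  # distance to region
        total += weight * max(0.0, radius - d) ** 2       # zero outside the region
    return total
```

A soft penalty keeps the MPC problem feasible even when the unsafe region blocks the nominal path; a hard constraint variant would instead require each planned state to stay outside the calibrated radius.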
Aaron O. Feldman
Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305, USA
Joseph A. Vincent
Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305, USA
Maximilian Adang
PhD Candidate, Aeronautics & Astronautics, Stanford University
JunEn Low
Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305, USA
Mac Schwager
Stanford University