Permissive Safety Through Trusted Inference: Verifiable Belief-Space Neural Safety Filters for Assured Interactive Robotics
2026-06-01 • Robotics
Robotics, Artificial Intelligence, Machine Learning
AI summary
The authors address how robots can safely interact with humans despite uncertainties like human goals and behavior. They focus on a method called belief-space safety filtering, which helps robots learn and adapt in real-time to reduce overly cautious safety behavior. To ensure this method is truly safe, the authors develop a way to mathematically verify it using a technique called conformal prediction, which checks the reliability of the robot’s decision-making process. Their approach allows robots to operate more efficiently while still being safe, as shown in tests with simulated human-vehicle interactions.
belief-space safety filter, runtime inference, conformal prediction, safety verification, human-robot interaction, neural approximation, uncertainty, closed-loop control, sample complexity, robot decision-making
Authors
Haimin Hu
Abstract
Autonomous robots that interact with people must make safe and efficient decisions under human-induced uncertainty, such as people's preferences, goals, competency, and willingness to cooperate. Safety filters are a popular approach for ensuring safety in interactive robotics, since their modular design separates safety from performance, allowing robots to operate safely around people with minimal impact on task efficiency. While traditional safety filters typically operate only in the physical space, neglecting the robot's ability to learn and adapt online, the recently proposed belief-space safety filter (BeliefSF) reasons about robot safety in closed loop with runtime inference that actively reduces the robot's uncertainty online, thereby reducing conservativeness in filtering. However, providing formal safety guarantees for robots deploying BeliefSF remains a significant challenge due to errors in runtime inference and in the neural approximation of safety filters, which is required to handle the high dimensionality of belief spaces. In this paper, we propose an algorithmic approach to certify high-probability safety of BeliefSF using conformal prediction, while explicitly accounting for the reliability of the robot's runtime inference module. Our method leverages the structure of belief-space safety filtering by focusing verification on a region where inference is expected to be reliable. It preserves the simplicity and sample complexity of standard conformal prediction, yet can certify a substantially less conservative safety filter. Through a simulated human-vehicle interaction benchmark, we show that our approach verifies a significantly more permissive belief-space safety filter than a standard conformal prediction baseline.
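For readers unfamiliar with the baseline, the following is a minimal sketch of standard split conformal prediction, the generic technique the abstract compares against. It is not the paper's region-focused method. The function name `conformal_quantile` and the notion of a scalar "nonconformity score" per calibration sample (for example, an error measure of a learned safety filter on a sampled scenario) are assumptions made for illustration only.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the split conformal threshold for miscoverage level alpha.

    scores: nonconformity scores from n calibration samples (hypothetical;
            assumed exchangeable with future test scores).
    alpha:  allowed miscoverage, e.g. 0.1 for a 90% guarantee.

    Under exchangeability, a fresh score falls at or below the returned
    threshold with probability at least 1 - alpha.
    """
    n = len(scores)
    # Rank of the order statistic used by standard split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples to certify this level: the only
        # valid threshold is the trivial one.
        return math.inf
    return sorted(scores)[k - 1]
```

The threshold depends on the calibration scores only through a single order statistic, which is why the procedure is simple and why its sample requirement is governed by `n` and `alpha` alone. When `n` is too small for the requested level, the sketch returns an infinite threshold rather than an invalid finite one.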