Making Embodied AI Reliable: A Community Agenda from Testing to Formal Verification
2026-06-02 • Software Engineering
Software Engineering, Robotics
AI summary
The authors argue that making embodied AI systems reliable is hard because these systems face unpredictable environments, human interaction, and emergent behavior across tightly coupled components. They highlight three main ways to improve reliability: scenario-based testing backed by validated specifications and meaningful coverage metrics, verifying system parts compositionally using symbolic models, and runtime mechanisms that adapt when conditions change during use. The authors suggest combining these methods into a continuous process using shared representations and feedback. This integrated approach aims to help embodied AI systems work safely in the real world.
Embodied AI, Reliability, Scenario-Based Testing, Formal Verification, Runtime Assurance, Neuro-Symbolic Representations, System Lifecycle, Distribution Shifts, Compositional Verification
Authors
Xi Zheng, Dulanga Weerakoon, Yintong Huo, Teresa Yeo, Guy Van Den Broeck, Vijay Ganesh, Daniel Neider, Biplav Srivastava, Ivan Ruchkin, Archan Misra, Corina Pasareanu
Abstract
Embodied AI systems are increasingly deployed in open-world environments, yet ensuring their reliability remains a fundamental challenge. Drawing on discussions from the AAAI'26 Bridge Program on "Making Embodied AI Reliable with Testing and Formal Verification", this article argues that reliability in embodied AI is inherently a lifecycle assurance problem arising from uncertainty, human interaction, and emergent behaviors across tightly coupled system components. We identify three complementary directions toward reliable embodied AI: (1) trustworthy scenario-based testing supported by validated specifications and meaningful coverage metrics, (2) compositional verification enabled by structured symbolic representations of system behavior and environmental context, and (3) runtime assurance mechanisms capable of adapting to uncertainty and distribution shifts during deployment. Rather than treating these approaches independently, we advocate integrated assurance workflows that connect testing, verification, and runtime adaptation through shared neuro-symbolic representations and continuous feedback across the system lifecycle. Such integration provides a foundation for building trustworthy embodied AI systems that can operate safely and reliably in complex real-world environments.