IndoorR2X: Indoor Robot-to-Everything Coordination with LLM-Driven Planning

2026-03-20

Robotics, Multiagent Systems
AI summary

The authors created IndoorR2X, a system that helps multiple robots work together better indoors by using not only each robot's sensors but also existing Internet of Things (IoT) devices like cameras in the building. This combined approach helps the robots understand the whole scene more efficiently without needing to explore as much. They use Large Language Models (LLMs) to plan tasks and coordinate the robot team based on this shared information. Their tests show that using IoT devices makes robot teamwork faster and more reliable, and they also discuss challenges that still need solving.

Robot-to-Robot (R2R) communication, Internet of Things (IoT), Large Language Models (LLMs), Multi-robot task planning, Indoor scene understanding, Semantic state, Simulation framework, Robot coordination, Exploration overhead, Sensor fusion
Authors
Fan Yang, Soumya Teotia, Shaunak A. Mehta, Prajit KrisshnaKumar, Quanting Xie, Jun Liu, Yueqi Song, Li Wenkai, Atsunori Moteki, Kanji Uchino, Yonatan Bisk
Abstract
Although robot-to-robot (R2R) communication improves indoor scene understanding beyond what a single robot can achieve, R2R alone cannot overcome partial observability without substantial exploration overhead or scaling team size. In contrast, many indoor environments already include low-cost Internet of Things (IoT) sensors (e.g., cameras) that provide persistent, building-wide context beyond onboard perception. We therefore introduce IndoorR2X, the first benchmark and simulation framework for Large Language Model (LLM)-driven multi-robot task planning with Robot-to-Everything (R2X) perception and communication in indoor environments. IndoorR2X integrates observations from mobile robots and static IoT devices to construct a global semantic state that supports scalable scene understanding, reduces redundant exploration, and enables high-level coordination through LLM-based planning. IndoorR2X provides configurable simulation environments, sensor layouts, robot teams, and task suites to systematically evaluate high-level semantic coordination strategies. Extensive experiments across diverse settings demonstrate that IoT-augmented world modeling improves multi-robot efficiency and reliability, and we highlight key insights and failure modes for advancing LLM-based collaboration between robot teams and indoor IoT sensors.
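To make the idea of a global semantic state concrete, the following is a minimal Python sketch of how observations from mobile robots and static IoT devices might be merged into one shared view and rendered as text for an LLM planner. It is an illustration under stated assumptions, not the IndoorR2X implementation: the names `Observation`, `GlobalSemanticState`, `update`, `unobserved_rooms`, and `to_prompt`, the source IDs, and the "latest timestamp wins" merge rule are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Observation:
    """One semantic detection (hypothetical schema, not from the paper)."""
    source: str       # e.g. "robot_1" or "camera_hall" (made-up IDs)
    room: str         # room in which the object was seen
    obj: str          # object label
    timestamp: float  # time of the detection


@dataclass
class GlobalSemanticState:
    """Shared view merged from robot and IoT observations (assumed design)."""
    latest: dict = field(default_factory=dict)     # object label -> newest Observation
    seen_rooms: set = field(default_factory=set)   # rooms covered by any source

    def update(self, obs: Observation) -> None:
        # Any observation, robot or IoT, counts as coverage of its room.
        self.seen_rooms.add(obs.room)
        current = self.latest.get(obs.obj)
        # Assumed merge rule: the most recent observation of an object wins.
        if current is None or obs.timestamp >= current.timestamp:
            self.latest[obs.obj] = obs

    def unobserved_rooms(self, all_rooms) -> list:
        # Rooms no source has covered yet; candidates for exploration.
        return sorted(set(all_rooms) - self.seen_rooms)

    def to_prompt(self) -> str:
        # Render the state as plain text that could be placed in an LLM prompt.
        lines = [
            f"- {o.obj} in {o.room} (seen by {o.source} at t={o.timestamp})"
            for o in sorted(self.latest.values(), key=lambda o: o.obj)
        ]
        return "Known objects:\n" + "\n".join(lines)


state = GlobalSemanticState()
state.update(Observation("robot_1", "kitchen", "mug", 1.0))
state.update(Observation("camera_hall", "hallway", "mug", 2.0))  # newer IoT sighting
state.update(Observation("robot_2", "office", "laptop", 1.5))
print(state.to_prompt())
print(state.unobserved_rooms(["kitchen", "hallway", "office", "bedroom"]))
```

In this sketch, a static camera's sighting updates the shared state without any robot having to visit the hallway, which is the kind of reduction in redundant exploration the abstract describes. Only rooms that no source has covered remain as exploration candidates.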