EgoGuide: Egocentric Guidance for Efficient Robot-Free Demonstration Collection and Learning

2026-06-12 | Robotics

AI summary

The authors introduce EgoGuide, a robot-free demonstration collection interface that records synchronized views from a wrist camera and a head/egocentric camera. It gives online visual-geometric feedback on data quality during collection, which helps avoid recording redundant demonstrations. They also propose a Gated Egocentric Residual Policy, in which egocentric context corrects ambiguous wrist-view observations while wrist-view control stays stable despite the varying egocentric viewpoint. Real-world experiments show that EgoGuide reduces the number of demonstration episodes required and that the residual policy is more robust under visual occlusion.

Robot learning, Demonstrations, Egocentric camera, Visual-geometric data, Data efficiency, Policy learning, Viewpoint variation, Occlusion robustness, Universal Manipulation Interface (UMI)
Authors
Yue Xu, Mingtao Nie, Tianle Li, Hong Li, Yibo Luo, Siyuan Huang, Yong-Lu Li
Abstract
Robot learning from real-world demonstrations is currently constrained by data scaling. Universal Manipulation Interface (UMI) provides an efficient robot-free data collection interface, yet current UMI-style pipelines often collect redundant demonstrations and lack global scene context. To improve data efficiency, we present EgoGuide, a collection interface that records synchronized wrist and head/egocentric observations and couples them with online visual-geometric data quality guidance. We also introduce a Gated Egocentric Residual Policy for robust learning from a viewpoint-varying egocentric camera, allowing head/egocentric context to correct ambiguous local observations while preserving stable wrist-view control. Real-world experiments show that EgoGuide reduces the required number of data episodes and improves data efficiency. The residual policy further improves robustness under visual occlusion. Project Page: https://silicx.github.io/EgoGuide
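The abstract does not give the architecture of the Gated Egocentric Residual Policy, so the following is only a minimal sketch of a generic gated residual fusion, not the authors' implementation. It assumes the wrist feature is the base signal, the egocentric feature contributes a linear residual, and a single scalar sigmoid gate decides how much of that residual is applied. The names `gated_residual_fusion`, `W_res`, `w_gate`, and `b_gate` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def matvec(W: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Multiply matrix W (rows x cols) by vector x (length cols)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def gated_residual_fusion(
    wrist_feat: Sequence[float],
    ego_feat: Sequence[float],
    W_res: Sequence[Sequence[float]],   # hypothetical: maps ego features to wrist feature space
    w_gate: Sequence[float],            # hypothetical: gate weights over [wrist_feat, ego_feat]
    b_gate: float = 0.0,                # hypothetical: gate bias
) -> Tuple[List[float], float]:
    """Return (fused feature, gate value).

    fused = wrist_feat + gate * (W_res @ ego_feat)

    When the gate is near 0 the output falls back to the wrist feature alone,
    which is one way to keep wrist-view control stable (an assumption here).
    """
    residual = matvec(W_res, ego_feat)
    gate_input = list(wrist_feat) + list(ego_feat)
    gate = sigmoid(sum(w * v for w, v in zip(w_gate, gate_input)) + b_gate)
    fused = [w + gate * r for w, r in zip(wrist_feat, residual)]
    return fused, gate


# Toy usage with made-up numbers.
wrist = [1.0, 2.0]
ego = [0.5, -0.5, 1.0]
W_res = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
w_gate = [0.0] * 5
fused, gate = gated_residual_fusion(wrist, ego, W_res, w_gate)
print(fused, gate)
```

In this sketch the residual path can only add a correction on top of the wrist feature, so a closed gate leaves the wrist-only behavior untouched. A learned policy would train `W_res`, `w_gate`, and `b_gate` jointly with the rest of the network, and might use a per-dimension gate instead of a scalar.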