RoboPocket: Improve Robot Policies Instantly with Your Phone

2026-03-05 · Robotics

Robotics · Artificial Intelligence · Machine Learning
AI summary

The authors present RoboPocket, a system that uses smartphones and augmented reality to help people teach robots more efficiently without needing a physical robot. By showing where the robot's current plan might fail, users can focus data collection on weak spots in the robot's behavior. This approach lets the policy update quickly and requires less data than traditional offline methods. Their experiments show it speeds up the learning process and works well even with few corrections from users.

Imitation Learning · Data Collection · Covariate Shift · DAgger · Augmented Reality · Remote Inference · Online Finetuning · Policy Iteration · Sample Efficiency
Authors
Junjie Fang, Wendi Chen, Han Xue, Fangyuan Zhou, Tian Le, Yi Wang, Yuting Zhang, Jun Lv, Chuan Wen, Cewu Lu
Abstract
Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing the underlying policy's weaknesses, leading to inefficient coverage of critical state distributions. Conversely, interactive methods like DAgger effectively address covariate shift but rely on physical robot execution, which is costly and difficult to scale. To reconcile this trade-off, we introduce RoboPocket, a portable system that enables Robot-Free Instant Policy Iteration using a single consumer smartphone. Its core innovation is a Remote Inference framework that visualizes the policy's predicted trajectory via Augmented Reality (AR) Visual Foresight. This immersive feedback allows collectors to proactively identify potential failures and focus data collection on the policy's weak regions without requiring a physical robot. Furthermore, we implement an asynchronous Online Finetuning pipeline that continuously updates the policy with incoming data, effectively closing the learning loop in minutes. Extensive experiments demonstrate that RoboPocket adheres to data scaling laws and doubles the data efficiency compared to offline scaling strategies, overcoming their long-standing efficiency bottleneck. Moreover, our instant iteration loop also boosts sample efficiency by up to 2$\times$ in distributed environments with only a small number of interactive corrections per person. Project page and videos: https://robo-pocket.github.io.
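To make the asynchronous online-finetuning idea concrete, here is a minimal sketch of that loop: a collector thread pushes (observation, corrective action) pairs into a queue while a background trainer continuously consumes them and applies gradient updates. The `LinearPolicy` class, its learning rate, and the true mapping being imitated are all hypothetical stand-ins for illustration, not RoboPocket's actual policy or update rule.

```python
import queue
import threading

class LinearPolicy:
    """Toy stand-in for the learned policy: action = w * observation."""
    def __init__(self, w=0.0, lr=0.1):
        self.w = w
        self.lr = lr
        self.updates = 0

    def predict(self, obs):
        return self.w * obs

    def finetune(self, obs, expert_action):
        # One gradient step on the squared error against the correction.
        grad = 2.0 * (self.predict(obs) - expert_action) * obs
        self.w -= self.lr * grad
        self.updates += 1

def online_finetuning(policy, demo_queue, stop_event):
    """Consume incoming demonstrations and update the policy asynchronously."""
    # Keep training until collection has stopped AND the queue is drained.
    while not stop_event.is_set() or not demo_queue.empty():
        try:
            obs, action = demo_queue.get(timeout=0.05)
        except queue.Empty:
            continue
        policy.finetune(obs, action)

policy = LinearPolicy()
demos = queue.Queue()
stop = threading.Event()
trainer = threading.Thread(target=online_finetuning, args=(policy, demos, stop))
trainer.start()

# Collector pushes corrections targeting the (hypothetical) true mapping a = 2 * o.
for obs in [1.0, 2.0, 0.5, 1.5] * 25:
    demos.put((obs, 2.0 * obs))

stop.set()
trainer.join()
print(round(policy.w, 2))  # converges toward 2.0
```

The key design point the abstract emphasizes is asynchrony: because the trainer runs concurrently with collection, the policy (and hence the AR foresight shown to the operator) reflects new corrections within the same session rather than after an offline retraining pass.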