RoSHI: A Versatile Robot-oriented Suit for Human Data In-the-Wild

2026-04-08 · Robotics

Robotics · Artificial Intelligence · Computer Vision and Pattern Recognition
AI summary

The authors created a wearable system called RoSHI that combines two types of sensors: small motion sensors (IMUs) worn on the body and smart glasses with cameras. Together, these sensors track a person's full 3D body movement and shape while they move naturally, without needing any external cameras. The system stays accurate even when body parts are hidden from view or moving fast, and it keeps tracking consistent over long recordings. In tests on agile activities, it outperformed other wearable methods and came close to the best external-camera systems. The resulting data can also help teach robots to move more like humans in real-world tasks.

robot learning · egocentric perception · IMU (Inertial Measurement Unit) · SLAM (Simultaneous Localization and Mapping) · 3D pose estimation · body shape reconstruction · wearable sensors · motion capture · humanoid policy learning · Project Aria glasses
Authors
Wenjing Margaret Mao, Jefferson Ng, Luyang Hu, Daniel Gehrig, Antonio Loquercio
Abstract
Scaling up robot learning will likely require human data containing rich and long-horizon interactions in the wild. Existing approaches for collecting such data trade off portability, robustness to occlusion, and global consistency. We introduce RoSHI, a hybrid wearable that fuses low-cost sparse IMUs with the Project Aria glasses to estimate the full 3D pose and body shape of the wearer in a metric global coordinate frame from egocentric perception. This system is motivated by the complementarity of the two sensors: IMUs provide robustness to occlusions and high-speed motions, while egocentric SLAM anchors long-horizon motion and stabilizes upper body pose. We collect a dataset of agile activities to evaluate RoSHI. On this dataset, we generally outperform other egocentric baselines and perform comparably to a state-of-the-art exocentric baseline (SAM3D). Finally, we demonstrate that the motion data recorded from our system are suitable for real-world humanoid policy learning. For videos, data and more, visit the project webpage: https://roshi-mocap.github.io/
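The complementarity claim in the abstract — high-rate but drifty IMU dead reckoning, anchored by sparse drift-free SLAM poses in a global frame — can be illustrated with a toy complementary filter. This is a minimal sketch, not the authors' estimator: the function name `fuse_poses`, the 1-D trajectory, the bias model, and the blend weight `alpha` are all illustrative assumptions.

```python
# Illustrative sketch only; RoSHI's actual estimator is not described here.
# IMU dead reckoning is smooth and high-rate but accumulates drift, while
# sparse egocentric-SLAM fixes are drift-free in a metric global frame.

def fuse_poses(imu_deltas, slam_fixes, alpha=0.2):
    """Toy complementary filter over a 1-D trajectory (hypothetical).

    imu_deltas: per-step displacements integrated from the IMU (biased).
    slam_fixes: {step_index: global_position} sparse SLAM anchors.
    alpha: residual trust in the IMU estimate when a SLAM fix arrives.
    """
    x = 0.0
    fused = []
    for k, dx in enumerate(imu_deltas):
        x += dx                                   # IMU prediction (drifts)
        if k in slam_fixes:                       # sparse global correction
            x = alpha * x + (1.0 - alpha) * slam_fixes[k]
        fused.append(x)
    return fused

# Ground truth: 0.1 m per step; the IMU over-reports by 0.01 m per step.
true_pos = [0.1 * (k + 1) for k in range(100)]
imu_deltas = [0.11] * 100
slam_fixes = {k: true_pos[k] for k in range(9, 100, 10)}  # anchor every 10 steps

fused = fuse_poses(imu_deltas, slam_fixes)
drift_open_loop = abs(sum(imu_deltas) - true_pos[-1])  # ~1.0 m without SLAM
drift_fused = abs(fused[-1] - true_pos[-1])            # a few centimeters
```

Between anchors the trajectory keeps the IMU's smooth high-rate motion; each SLAM fix pulls the estimate back to the global frame, so long-horizon drift stays bounded instead of growing linearly — the same division of labor the abstract attributes to the two sensors.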