AetheRock: An Arm-Worn Robot Teaching System for Force-Guided Vision-Tactile Learning

2026-06-08

Robotics
AI summary

The authors created AetheRock, an arm-worn device that collects grip-force, vision, and touch data to help robots learn contact-rich manipulation. They also developed ForceVT, a method that uses force and vision information to guide how robots learn from touch, even when touch sensors vary in quality. Their experiments showed that AetheRock gathers data efficiently and that ForceVT helps the robot learn reliably despite sensor differences. This work combines new hardware and software to improve robot learning for tasks that need careful touch and force sensing.

force sensing, tactile sensing, robot learning, wearable sensors, visuo-tactile sensors, representation learning, data efficiency, sensor fusion, robot manipulation, GelSlim-MiniFab
Authors
Hong Li, Yue Xu, Yihan Tang, Yankang Dong, Chenyuan Liu, Chenyang Yu, Xuyang Li, Siyuan Huang, Yujun Shen, Nan Xue, Yong-Lu Li
Abstract
Force and tactile sensing are indispensable in contact-rich manipulation. However, force-aware robot learning faces critical challenges because tactile and force sensors are difficult to assemble together in handheld or wearable devices. To address this limitation, we first introduce AetheRock, an arm-worn device for collecting gripper-force, vision, and tactile data. It features a modular and easily manufactured visuo-tactile sensor, GelSlim-MiniFab, at the fingertip; a resistive pressure sensor at the human finger contact region; a customized PCB module; and a wearable kit for comfortable and robust collection. Building on this, we propose ForceVT, a representation learning framework that uses force and vision to guide fidelity-agnostic tactile learning, enabling robust inference in any tactile situation. Real-world experiments show that AetheRock collects data efficiently and that ForceVT effectively alleviates inefficiencies when visuo-tactile sensors exhibit manufacturing and utilization inconsistencies. Overall, our work mitigates the limitations of gripper-force vision-tactile robot learning through innovative hardware design and algorithms.