One Hand to Rule Them All: Canonical Representations for Unified Dexterous Manipulation
2026-02-18 • Robotics
AI summary
The authors created a new way to describe many different robot hands using a shared format and set of parameters. This helps machines learn how to control various hands without retraining for each specific design. They tested the method by training a grasping policy and transferring it zero-shot to hands it hadn't seen before, reaching high success rates (e.g., 81.9% on a 3-finger LEAP Hand). Their work aims to make robot hand control more flexible and efficient across different designs.
dexterous manipulation • robot hand design • parameterized representation • URDF • variational autoencoder (VAE) • latent space • zero-shot transfer • grasping policy • cross-embodiment learning
Authors
Zhenyu Wei, Yunchao Yao, Mingyu Ding
Abstract
Dexterous manipulation policies today largely assume fixed hand designs, severely restricting their generalization to new embodiments with varied kinematic and structural layouts. To overcome this limitation, we introduce a parameterized canonical representation that unifies a broad spectrum of dexterous hand architectures. It comprises a unified parameter space and a canonical URDF format, offering three key advantages. 1) The parameter space captures essential morphological and kinematic variations for effective conditioning in learning algorithms. 2) A structured latent manifold can be learned over our space, where interpolations between embodiments yield smooth and physically meaningful morphology transitions. 3) The canonical URDF standardizes the action space while preserving dynamic and functional properties of the original URDFs, enabling efficient and reliable cross-embodiment policy learning. We validate these advantages through extensive analysis and experiments, including grasp policy replay, VAE latent encoding, and cross-embodiment zero-shot transfer. Specifically, we train a VAE on the unified representation to obtain a compact, semantically rich latent embedding, and develop a grasping policy conditioned on the canonical representation that generalizes across dexterous hands. We demonstrate, through simulation and real-world tasks on unseen morphologies (e.g., an 81.9% zero-shot success rate on a 3-finger LEAP Hand), that our framework unifies both the representational and action spaces of structurally diverse hands, providing a scalable foundation for cross-hand learning toward universal dexterous manipulation.
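To make the first advantage concrete, here is a minimal sketch of what a unified parameter space might look like in code: every hand, regardless of finger count, is flattened into one fixed-length vector suitable for conditioning a policy or a VAE. All names, dimensions, and the padding-plus-mask scheme below are our own illustrative assumptions, not the paper's actual parameterization.

```python
from dataclasses import dataclass

import numpy as np

MAX_FINGERS = 5  # assumed cap; hands with fewer fingers are zero-padded
MAX_JOINTS = 4   # assumed per-finger joint budget


@dataclass
class HandParams:
    """Hypothetical unified parameter space for a dexterous hand.

    Field names and dimensions are illustrative, not the paper's scheme.
    """
    num_fingers: int
    link_lengths: np.ndarray  # (num_fingers, MAX_JOINTS), metres
    joint_limits: np.ndarray  # (num_fingers, MAX_JOINTS, 2), radians
    palm_size: np.ndarray     # (2,), palm width and length, metres

    def to_vector(self) -> np.ndarray:
        """Flatten to one fixed-length vector so structurally different
        hands share a single conditioning input; absent fingers stay
        zero-padded and are flagged by a binary mask."""
        mask = np.zeros(MAX_FINGERS, dtype=np.float32)
        mask[: self.num_fingers] = 1.0
        lengths = np.zeros((MAX_FINGERS, MAX_JOINTS), dtype=np.float32)
        lengths[: self.num_fingers] = self.link_lengths
        limits = np.zeros((MAX_FINGERS, MAX_JOINTS, 2), dtype=np.float32)
        limits[: self.num_fingers] = self.joint_limits
        return np.concatenate(
            [mask, lengths.ravel(), limits.ravel(),
             self.palm_size.astype(np.float32)]
        )


# A 3-finger hand (e.g. a reduced LEAP-style design) maps to the same
# 67-dim vector (5 + 20 + 40 + 2) that a 5-finger hand would.
three_finger = HandParams(
    num_fingers=3,
    link_lengths=np.full((3, MAX_JOINTS), 0.04),
    joint_limits=np.tile([-1.57, 1.57], (3, MAX_JOINTS, 1)),
    palm_size=np.array([0.08, 0.10]),
)
print(three_finger.to_vector().shape)  # (67,)
```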
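The second advantage, the structured latent manifold, can be sketched with a small VAE over those parameter vectors: linearly interpolating between the latent codes of two hands and decoding along the path yields intermediate morphologies. The architecture below is a minimal assumption for illustration, not the model the authors trained.

```python
import torch
import torch.nn as nn

PARAM_DIM = 67   # length of the flattened hand vector in the sketch above
LATENT_DIM = 8   # assumed latent size


class HandVAE(nn.Module):
    """Minimal VAE over hand parameter vectors (illustrative architecture)."""

    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(PARAM_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * LATENT_DIM))
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, PARAM_DIM))

    def encode(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        return mu, logvar

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar


# Interpolate between two embodiments in latent space; each decoded point
# along the path is an intermediate hand parameterization.
vae = HandVAE()
x_a, x_b = torch.randn(1, PARAM_DIM), torch.randn(1, PARAM_DIM)  # placeholders
z_a, _ = vae.encode(x_a)
z_b, _ = vae.encode(x_b)
for t in torch.linspace(0.0, 1.0, 5):
    intermediate = vae.dec((1 - t) * z_a + t * z_b)
```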
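The third advantage, the standardized action space, amounts in practice to a fixed scatter/gather between each hand's native joint ordering and a canonical slot layout. A sketch of that remapping, with the slot assignment assumed to be derived once from the canonical URDF (the paper's exact scheme may differ), could look like this:

```python
import numpy as np

CANON_SLOTS = 20  # one slot per (finger, joint) pair: 5 fingers x 4 joints


def to_canonical(native_action: np.ndarray, slot_index: np.ndarray) -> np.ndarray:
    """Scatter native joint commands into the fixed canonical layout.

    slot_index[i] is the canonical slot of native joint i; this mapping is
    assumed to be computed once from the canonical URDF."""
    canon = np.zeros(CANON_SLOTS, dtype=np.float32)
    canon[slot_index] = native_action
    return canon


def from_canonical(canon_action: np.ndarray, slot_index: np.ndarray) -> np.ndarray:
    """Gather the policy's fixed-size output back into native joint order."""
    return canon_action[slot_index]


# Example: a 16-DoF hand whose joints occupy the first 16 canonical slots.
slots = np.arange(16)
native = np.random.uniform(-1, 1, size=16).astype(np.float32)
assert np.allclose(from_canonical(to_canonical(native, slots), slots), native)
```

Under this view, one policy can emit a fixed-size canonical action for every hand, and only the per-hand slot mapping changes across embodiments.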