Affordance-Based Hierarchical Reinforcement Learning for Quadruped Pedipulation
2026-06-05 • Robotics
AI summary
The authors address the challenge of quadruped robots manipulating objects without following preset paths. They created a three-level learning system that helps the robot pick good spots to interact with objects and position itself properly. This system uses cues called affordances to guide both navigation and manipulation. They trained the method in simulation and tested it in both simulation and real-world experiments, showing that the robot can perform object manipulation autonomously. Their work removes the need for human input during these tasks.
Keywords: quadruped robots, reinforcement learning, pose affordance, locomotion policy, pedipulation, IsaacSim, object manipulation, hierarchical learning, end-effector, robot navigation
Authors
Tuba Girgin, Jose Castelblanco, Gabriel Rodriguez, Emre Girgin, Cagri Kilic
Abstract
The object manipulation capabilities of quadruped robots remain an open research challenge. While previous studies have focused on low-level policy learning, task execution still relies on expert-designed high-level trajectories. Autonomous selection of both an affordable interaction point on the target object and an affordable robot base pose removes the need for pre-designed trajectories. This study proposes a three-level hierarchical reinforcement learning (RL) framework that uses pose affordances to guide the navigation policy, which in turn drives the locomotion policy. In addition, the pedipulation policy is guided by interaction-point affordances, enabling object-centric pose alignment of the quadruped robot and effective end-effector manipulation planning. We train the proposed framework in the IsaacSim ecosystem and evaluate it in both simulation and real-world settings. We investigate the effectiveness of pose affordance across multiple scenarios in simulation, and we validate various object interaction tasks in real-world settings, forming an object-interaction dataset. The results show that the proposed framework can autonomously identify candidate poses based on their affordance and successfully execute object manipulation tasks in the real world without human guidance.
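To make the hierarchy concrete, the following is a minimal Python sketch of how a pose-affordance score could select a base pose that a navigation layer then turns into a velocity command. It is an illustration under stated assumptions, not the authors' implementation: `Pose2D`, `toy_affordance`, `select_pose`, and `navigation_command` are hypothetical names, the affordance is a hand-written heuristic standing in for a learned one, and the navigation step is a simple proportional rule standing in for a learned policy. The locomotion and pedipulation policies are omitted.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Pose2D:
    """Planar robot base pose (hypothetical type for this sketch)."""
    x: float
    y: float
    yaw: float


def wrap_angle(a: float) -> float:
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def toy_affordance(obj_xy: Tuple[float, float],
                   standoff: float = 0.6) -> Callable[[Pose2D], float]:
    """Hand-written stand-in for a learned pose affordance (assumption).

    Scores are higher for poses that sit `standoff` metres from the
    object and face it.
    """
    def score(p: Pose2D) -> float:
        dx, dy = obj_xy[0] - p.x, obj_xy[1] - p.y
        dist_err = abs(math.hypot(dx, dy) - standoff)
        yaw_err = abs(wrap_angle(math.atan2(dy, dx) - p.yaw))
        return -(dist_err + yaw_err)
    return score


def select_pose(candidates: Sequence[Pose2D],
                affordance: Callable[[Pose2D], float]) -> Pose2D:
    """Top level: pick the candidate base pose with the highest score."""
    return max(candidates, key=affordance)


def navigation_command(current: Pose2D, goal: Pose2D,
                       max_speed: float = 0.5) -> Tuple[float, float, float]:
    """Middle level stand-in: world-frame (vx, vy, yaw_rate) toward the goal.

    The linear part is capped at `max_speed`; a locomotion policy
    (not shown) would track this command.
    """
    dx, dy = goal.x - current.x, goal.y - current.y
    dist = math.hypot(dx, dy)
    scale = min(1.0, max_speed / dist) if dist > 1e-9 else 0.0
    return (dx * scale, dy * scale, wrap_angle(goal.yaw - current.yaw))
```

For example, with an object at `(1.0, 0.0)` and candidates `Pose2D(0.0, 0.0, 0.0)`, `Pose2D(0.4, 0.0, 0.0)`, and `Pose2D(1.6, 0.0, 0.0)`, `select_pose` returns the second pose, since it sits at the standoff distance and faces the object. Keeping the affordance as a plain callable means a learned scorer could be swapped in without changing the selection or navigation code.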