Flow-based Policy Adaptation without Policy Updates

2026-06-04

Robotics
AI summary

The authors propose GLOVES, a family of flow-based methods that corrects suboptimal actions from a pretrained policy, foundation model, or human operator by moving them toward the actions an expert would take. Rather than taking full control away from the agent, GLOVES corrects only actions that look anomalous or out-of-distribution, which preserves the agent's original intent. It learns from a small number of expert demonstrations or reusable successful skill segments and stitches local expert action patterns together during execution, making it a lightweight way to improve task success without extensive training.

pretrained policies, foundation models, flow-based adaptation, expert action distribution, in-distribution scoring, out-of-distribution (OOD), shared control, robot skill learning, action correction, selective intervention
Authors
Luzhe Sun, Jingtian Ji, Haoran Chen, Jiawei Zhou, Matthew R. Walter
Abstract
Leveraging prior knowledge from pretrained policies, foundation models, or human operators offers an efficient alternative to learning robot skills from scratch. However, these agents often provide actions that are suboptimal, noisy, or misaligned with task-specific expert behavior. We propose GLOVES, a family of flow-based adaptation methods that correct non-expert actions by transporting them toward an expert action distribution. Rather than replacing agentic control with full autonomy, GLOVES performs selective action-level adaptation, improving task success while preserving agent intent. The learned flow also provides a natural in-distribution scoring mechanism through reverse flow evaluation. We use this signal as an intervention gate: actions that appear consistent with the expert distribution are passed through unchanged, while anomalous or out-of-distribution (OOD) actions are corrected. In this way, assistance is only provided when necessary. GLOVES requires only limited expert supervision, using a small number of demonstrations or reusable successful skill segments. By learning local expert action patterns and stitching them during execution, GLOVES provides a lightweight shared-control module for robust action adaptation across tasks and environments. Code and demos are available at ripl.github.io/GLOVES_web.
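The gating idea described in the abstract can be illustrated with a toy sketch. The abstract does not specify the flow architecture, the scoring rule, or the intervention threshold, so everything below is an assumption made for illustration: a fixed per-dimension affine map stands in for the learned flow, the latent norm stands in for the in-distribution score, and the names `fit_affine_flow`, `reverse_flow`, `forward_flow`, `gated_adapt`, and `threshold` are hypothetical. It is not the authors' implementation.

```python
import numpy as np

def fit_affine_flow(expert_actions):
    """Fit a toy invertible map z -> mu + sigma * z from expert actions.
    (Assumption: stands in for a learned flow; not the GLOVES model.)"""
    mu = expert_actions.mean(axis=0)
    sigma = expert_actions.std(axis=0) + 1e-8  # avoid division by zero
    return mu, sigma

def reverse_flow(action, mu, sigma):
    """Map an action back to the latent (base) space."""
    return (action - mu) / sigma

def forward_flow(z, mu, sigma):
    """Map a latent point to the action space."""
    return mu + sigma * z

def gated_adapt(action, mu, sigma, threshold=2.0):
    """Pass in-distribution actions through; correct the rest.

    Score: latent norm under the reverse map (hypothetical choice).
    Correction: shrink the latent onto the threshold radius, then
    map it forward again (hypothetical choice).
    Returns (action, was_corrected).
    """
    z = reverse_flow(action, mu, sigma)
    score = np.linalg.norm(z)
    if score <= threshold:
        return action, False  # consistent with expert data: unchanged
    z_corrected = z * (threshold / score)
    return forward_flow(z_corrected, mu, sigma), True

if __name__ == "__main__":
    # Four made-up 2-D expert actions clustered around (1.0, 0.0).
    expert = np.array([[0.9, -0.1], [1.1, 0.1], [0.9, 0.1], [1.1, -0.1]])
    mu, sigma = fit_affine_flow(expert)

    for a in (np.array([1.05, 0.0]), np.array([0.0, 1.0])):
        adapted, corrected = gated_adapt(a, mu, sigma)
        print(a, "->", adapted, "corrected:", corrected)
```

In this sketch the first action lies close to the expert data and is returned unchanged, while the second lies far outside it and is pulled back along the line toward the expert mean. A learned flow would replace the affine map, and the score and correction rule would follow whatever the trained model provides.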