Model-Based Reinforcement Learning for Control under Time-Varying Dynamics
2026-04-02 • Machine Learning
Machine Learning, Robotics
AI summary
The authors study how to teach a computer to control machines whose behavior changes over time, as is common in the real world due to wear, drift, and changing operating conditions. They focus on reinforcement learning methods that maintain a model of the system and update it as the dynamics change across episodes. Their analysis shows that outdated data can mislead the learning process, so the data used for learning must be chosen carefully. Based on these insights, they propose a learning algorithm that adaptively limits which past data it uses, and they evaluate it on control tasks with continuously changing dynamics, showing improved performance.
reinforcement learning, non-stationary dynamics, model-based control, Gaussian processes, dynamic regret, adaptive data buffering, continual learning, transition dynamics, control benchmarks
Authors
Klemens Iten, Bruce Lee, Chenhao Li, Lenart Treven, Andreas Krause, Bhavya Sukhija
Abstract
Learning-based control methods typically assume stationary system dynamics, an assumption often violated in real-world systems due to drift, wear, or changing operating conditions. We study reinforcement learning for control under time-varying dynamics. We consider a continual model-based reinforcement learning setting in which an agent repeatedly learns and controls a dynamical system whose transition dynamics evolve across episodes. We analyze the problem using Gaussian process dynamics models under frequentist variation-budget assumptions. Our analysis shows that persistent non-stationarity requires explicitly limiting the influence of outdated data to maintain calibrated uncertainty and meaningful dynamic regret guarantees. Motivated by these insights, we propose a practical optimistic model-based reinforcement learning algorithm with adaptive data buffer mechanisms and demonstrate improved performance on continuous control benchmarks with non-stationary dynamics.
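The core idea of limiting the influence of outdated data can be illustrated with a toy sketch (this is an assumption-laden illustration, not the authors' algorithm): a bounded FIFO buffer of transitions, paired with a minimal Gaussian process regression over (state, action) inputs. Evicting old transitions caps how much stale data can distort the fitted dynamics model. The class and function names (`AdaptiveBuffer`, `gp_posterior_mean`) and all hyperparameters are hypothetical.

```python
import numpy as np
from collections import deque

class AdaptiveBuffer:
    """Illustrative sliding-window transition buffer (hypothetical sketch,
    not the paper's mechanism). FIFO eviction bounds the staleness of the
    data a dynamics model is fitted on."""
    def __init__(self, max_size=50):
        self.data = deque(maxlen=max_size)  # oldest transitions drop out first

    def add(self, state, action, next_state):
        self.data.append((np.asarray(state, float),
                          np.asarray(action, float),
                          np.asarray(next_state, float)))

    def arrays(self):
        """Stack buffered transitions into GP training matrices."""
        s, a, ns = zip(*self.data)
        X = np.hstack([np.vstack(s), np.vstack(a)])  # inputs: (state, action)
        Y = np.vstack(ns)                            # targets: next state
        return X, Y

def gp_posterior_mean(X, Y, Xq, lengthscale=1.0, noise=1e-2):
    """Standard GP regression posterior mean with an RBF kernel,
    fitted only on the (recent) buffered data."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)
    K = k(X, X) + noise * np.eye(len(X))
    return k(Xq, X) @ np.linalg.solve(K, Y)
```

Under slowly drifting dynamics, shrinking `max_size` trades sample efficiency for tracking speed; the paper's analysis makes this trade-off precise via variation-budget assumptions and dynamic regret, and its adaptive mechanism chooses the effective window rather than fixing it by hand.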