Towards Affordable Energy: A Gymnasium Environment for Electric Utility Demand-Response Programs
2026-05-12 • Artificial Intelligence
Artificial Intelligence, Computers and Society, Computer Science and Game Theory, Machine Learning
AI summary
The authors created DR-Gym, a simulation tool to help electric utilities practice managing demand-response programs, which encourage consumers to use less electricity during costly or extreme weather times. Unlike other tools focused on individual devices, DR-Gym looks at the broader market and uses real data on pricing and building energy use to make the simulation realistic. They built this tool so AI methods can learn to balance reducing costs and maintaining reliable electricity. Their tests show DR-Gym can simulate real situations and help develop smarter energy strategies.
demand response, reinforcement learning, offline data, electric utility, wholesale electricity markets, Gymnasium environment, simulation, regime-switching model, smart meters, energy pricing
Authors
Jose E. Aguilar Escamilla, Lingdong Zhou, Xiangqi Zhu, Huazheng Wang
Abstract
Extreme weather and volatile wholesale electricity markets expose residential consumers to catastrophic financial risks, yet demand response at the distribution level remains an underutilized tool for grid flexibility and energy affordability. While a demand-response program can shield consumers by issuing financial credits during high-price periods, optimizing this sequential decision-making process is a distinct challenge for reinforcement learning, despite the abundance of publicly available historical smart-meter and wholesale-pricing data. Such offline data fail to capture the dynamic, interactive feedback loop between an electric utility's pricing signals and customers' acceptance of, and adaptation to, a demand-response program. To address this, we introduce DR-Gym, an open-source, online, Gymnasium-compatible environment for training and evaluating demand-response policies from the electric utility's perspective. Unlike existing device-level energy simulators, our environment focuses on the market-level electric utility setting and provides a rich observation space relevant to the utility. The simulator additionally features a regime-switching wholesale price model calibrated to real-world extreme events, alongside physics-based building demand profiles. The learning signal is a configurable, multi-objective reward function that supports diverse learning objectives. Using baseline strategies and data snapshots, we demonstrate that the simulator produces realistic and learnable environments.
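To make two of the ideas in the abstract concrete, the following is a minimal sketch of a two-state regime-switching price process and a weighted-sum multi-objective reward. It is not DR-Gym's implementation: the regime names, the parameter values in `REGIMES`, the functions `simulate_prices` and `reward`, and the three reward terms (procurement cost, credits paid, unserved energy) are all hypothetical choices made for illustration, and the numbers are not calibrated to any real market.

```python
import random

# Hypothetical two-regime parameters (illustrative only, not DR-Gym's calibration).
REGIMES = {
    "normal": {"mean": 40.0, "std": 8.0, "stay_prob": 0.98},
    "spike": {"mean": 900.0, "std": 250.0, "stay_prob": 0.70},
}


def simulate_prices(n_steps, seed=0):
    """Sample wholesale prices ($/MWh) from a two-state Markov regime-switching model."""
    rng = random.Random(seed)
    regime = "normal"
    prices, regimes = [], []
    for _ in range(n_steps):
        params = REGIMES[regime]
        # Draw a price from the current regime; flooring at zero is a simplification
        # (real wholesale prices can go negative).
        price = max(0.0, rng.gauss(params["mean"], params["std"]))
        prices.append(price)
        regimes.append(regime)
        # Stay in the current regime with probability stay_prob, otherwise switch.
        if rng.random() > params["stay_prob"]:
            regime = "spike" if regime == "normal" else "normal"
    return prices, regimes


def reward(procurement_cost, credits_paid, unserved_energy, weights):
    """Weighted-sum scalarization of several (assumed) utility objectives, all penalized."""
    return -(
        weights["cost"] * procurement_cost
        + weights["credits"] * credits_paid
        + weights["reliability"] * unserved_energy
    )


# Usage: one week of hourly prices and a reward under one weight configuration.
prices, regimes = simulate_prices(24 * 7)
weights = {"cost": 1.0, "credits": 1.0, "reliability": 10.0}
example_reward = reward(100.0, 5.0, 0.0, weights)
```

Changing the entries of `weights` is one simple way to express different learning objectives, for example raising `reliability` to penalize unserved energy more heavily than cost. A weighted sum is only one possible scalarization, and the abstract does not specify which form the authors use.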