Recurrent Structural Policy Gradient for Partially Observable Mean Field Games

2026-02-23 • Artificial Intelligence

AI summary

The authors work on Mean Field Games, a way to study many interacting agents in which individual randomness averages out and only common shocks remain. They note that previous methods either suffer from high variance or scale poorly, and that hybrid methods had not been extended to settings where agents cannot fully observe everything. They introduce a new method called Recurrent Structural Policy Gradient (RSPG), which handles settings where agents remember past public information. Using this method and their new JAX-based tool MFAX, they report convergence that is an order of magnitude faster and solve a complicated economic model with different types of agents and shared uncertain events.

Mean Field Games, Common Noise, Partially Observable Settings, Monte Carlo Rollouts, Policy Gradient, Heterogeneous Agents, Transition Dynamics, Macroeconomics, JAX Framework, State Convergence

Authors

Clarisse Wibault, Johannes Forkel, Sebastian Towers, Tiphaine Wibault, Juan Duque, George Whittle, Andreas Schaab, Yucheng Yang, Chiyuan Wang, Michael Osborne, Benjamin Moll, Jakob Foerster

Abstract

Mean Field Games (MFGs) provide a principled framework for modeling interactions in large-population models: at scale, population dynamics become deterministic, with uncertainty entering only through aggregate shocks, or common noise. However, algorithmic progress has been limited, since model-free methods suffer from high variance and exact methods scale poorly. Recent Hybrid Structural Methods (HSMs) use Monte Carlo rollouts for the common noise in combination with exact estimation of the expected return, conditioned on those samples. So far, HSMs have not been scaled to partially observable settings. We propose Recurrent Structural Policy Gradient (RSPG), the first history-aware HSM for settings involving public information. We also introduce MFAX, our JAX-based framework for MFGs. By leveraging known transition dynamics, RSPG achieves state-of-the-art performance as well as an order-of-magnitude faster convergence and solves, for the first time, a macroeconomics MFG with heterogeneous agents, common noise and history-aware policies. MFAX is publicly available at: https://github.com/CWibault/mfax.
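To make the hybrid idea concrete, here is a minimal sketch in plain Python of how Monte Carlo sampling over the common noise can be combined with an exact, conditional computation of the expected return. It is not RSPG and does not use the MFAX API: the two-state model, the fixed policy, and every function name and number below are hypothetical, chosen only for illustration. The sketch evaluates a fixed policy rather than computing a policy gradient, and it assumes the representative agent follows the same policy as the population, so its state distribution coincides with the mean field.

```python
import itertools
import random

N_STATES = 2   # hypothetical individual states: 0 = low, 1 = high
HORIZON = 4    # number of time steps

def policy(state):
    """Probability of choosing action 1 ('effort') in a state (fixed, hypothetical)."""
    return 0.8 if state == 0 else 0.3

def p_high(state, action, shock):
    """Known transition dynamics: P(next state = 1 | state, action, common shock)."""
    base = 0.2 + 0.4 * action + 0.2 * state
    return base + (0.1 if shock == 1 else -0.1)

def reward(state, action, mu):
    """Reward from own state, reduced by crowding in the high state, minus effort cost."""
    return state * (1.0 - 0.5 * mu[1]) - 0.2 * action

def action_prob(state, action):
    return policy(state) if action == 1 else 1.0 - policy(state)

def step_mean_field(mu, shock):
    """Exact update of the population distribution, deterministic given the shock."""
    nxt = [0.0] * N_STATES
    for s in range(N_STATES):
        for a in (0, 1):
            mass = mu[s] * action_prob(s, a)
            ph = p_high(s, a, shock)
            nxt[1] += mass * ph
            nxt[0] += mass * (1.0 - ph)
    return nxt

def conditional_return(mu0, shocks):
    """Exact expected return conditioned on one common-noise path (no agent sampling)."""
    mu, total = list(mu0), 0.0
    for z in shocks:
        for s in range(N_STATES):
            for a in (0, 1):
                total += mu[s] * action_prob(s, a) * reward(s, a, mu)
        mu = step_mean_field(mu, z)
    return total

def hybrid_estimate(mu0, n_paths, seed=0):
    """Monte Carlo over common-noise paths only; everything else is exact."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        shocks = [rng.randint(0, 1) for _ in range(HORIZON)]
        total += conditional_return(mu0, shocks)
    return total / n_paths

def exact_value(mu0):
    """Brute-force average over all 2**HORIZON equally likely shock paths."""
    paths = list(itertools.product((0, 1), repeat=HORIZON))
    return sum(conditional_return(mu0, p) for p in paths) / len(paths)

if __name__ == "__main__":
    mu0 = [0.5, 0.5]
    print("hybrid estimate:", hybrid_estimate(mu0, n_paths=2000))
    print("exact value:    ", exact_value(mu0))
```

In this toy model the only sampled quantity is the shock path, so the estimator's variance comes from the common noise alone. The brute-force enumeration in `exact_value` is feasible here only because the horizon is tiny; its cost grows exponentially with the horizon, which is one way to see why sampling the common noise becomes attractive.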
