Neural Galerkin Normalizing Flows for Bayesian Inference of Diffusions with Inaccessible Boundaries
2026-06-03 • Machine Learning
AI summary
The authors address Bayesian inference for the parameters of a diffusion model observed at discrete times. The likelihood requires the transition density of the process between consecutive observations, which is generally not available in closed form. They propose a Normalizing Flow architecture, trained in a Neural Galerkin framework, that learns this density by solving the associated Fokker-Planck equation over a training distribution of initial states and model coefficients. The method targets processes whose diffusion matrix vanishes at inaccessible boundaries, such as Stochastic Volatility models that satisfy a Feller condition. Once trained offline, the flow makes posterior sampling much cheaper, because the Fokker-Planck equation no longer has to be solved for every parameter proposed by the sampler.
Bayesian inference, diffusion model, transition density, Fokker-Planck equation, Normalizing Flows, Neural Galerkin method, Stochastic Volatility models, Feller condition, Markov chain Monte Carlo, likelihood function
Authors
Riccardo Saporiti, Fabio Nobile
Abstract
One of the primary challenges in Bayesian inference on the parameters of a diffusion model from discrete observations is the unavailability of an analytical expression for the transition density function between consecutive observation times, which is needed to derive the likelihood function. Extending previous studies that solve Fokker-Planck (FP) type partial differential equations with Normalizing Flows, we propose a new Normalizing Flow architecture to learn the transition density function of the diffusion process between two observation times. We do so by solving in a Neural Galerkin framework the associated FP equation with a Dirac mass as initial condition, over a specified training distribution of the initial datum and the coefficients of the diffusion. We specifically focus on processes whose diffusion matrix vanishes in certain inaccessible boundary regions, such as Stochastic Volatility models that satisfy a Feller condition. The product of the obtained transition densities evaluated along the observed trajectory approximates the likelihood function, thereby enabling cheap posterior sampling via Markov chain Monte Carlo (MCMC). After the offline training phase, inference becomes significantly more efficient, as it avoids the need to solve the FP equation in real time for each parameter proposed by the MCMC sampler or to rely on other likelihood-free methods for Bayesian inference that involve repeated simulation of diffusion bridges.
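The online phase described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. By the Markov property, the log-likelihood of a discretely observed path is the sum of log transition densities between consecutive observations, and an MCMC sampler only needs to evaluate that sum for each proposed parameter. Everything named here is an assumption made for illustration: `log_transition_density` stands in for the trained flow and uses the closed-form Gaussian transition of an Ornstein-Uhlenbeck process (which has no inaccessible boundary), the prior is a flat box, and the sampler is a plain random-walk Metropolis scheme.

```python
import math
import random


def log_transition_density(x_next, x_prev, dt, theta):
    """Hypothetical stand-in for a trained surrogate log p(x_next | x_prev; dt, theta).

    Uses the exact Gaussian transition of dX = -kappa * X dt + sigma dW,
    with theta = (kappa, sigma). A trained flow would replace this function.
    """
    kappa, sigma = theta
    mean = x_prev * math.exp(-kappa * dt)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * dt)) / (2.0 * kappa)
    return -0.5 * (math.log(2.0 * math.pi * var) + (x_next - mean) ** 2 / var)


def log_likelihood(path, dt, theta):
    """Sum of log transition densities over consecutive observations."""
    return sum(
        log_transition_density(b, a, dt, theta) for a, b in zip(path, path[1:])
    )


def log_prior(theta):
    """Flat prior on the box (0, 10) x (0, 10); an assumption of this sketch."""
    kappa, sigma = theta
    return 0.0 if 0.0 < kappa < 10.0 and 0.0 < sigma < 10.0 else -math.inf


def metropolis(path, dt, theta0, n_steps, step=0.05, seed=0):
    """Random-walk Metropolis over theta; theta0 must lie inside the prior box."""
    rng = random.Random(seed)
    theta = tuple(theta0)
    logp = log_prior(theta) + log_likelihood(path, dt, theta)
    samples = []
    for _ in range(n_steps):
        prop = tuple(t + step * rng.gauss(0.0, 1.0) for t in theta)
        lp = log_prior(prop)
        if lp > -math.inf:
            # Only evaluate the likelihood for proposals inside the prior support.
            lp += log_likelihood(path, dt, prop)
        # Accept with probability min(1, exp(lp - logp)).
        if rng.random() < math.exp(min(0.0, lp - logp)):
            theta, logp = prop, lp
        samples.append(theta)
    return samples


def simulate_ou(x0, dt, n, theta, seed=1):
    """Exact simulation of the stand-in process, used only to create test data."""
    rng = random.Random(seed)
    kappa, sigma = theta
    a = math.exp(-kappa * dt)
    sd = math.sqrt(sigma ** 2 * (1.0 - a * a) / (2.0 * kappa))
    path = [x0]
    for _ in range(n):
        path.append(a * path[-1] + sd * rng.gauss(0.0, 1.0))
    return path


if __name__ == "__main__":
    observed = simulate_ou(0.5, 0.1, 200, (1.0, 0.5))
    chain = metropolis(observed, 0.1, (2.0, 1.0), 2000)
    kept = chain[len(chain) // 2:]  # discard the first half as burn-in
    print("posterior mean of kappa:", sum(k for k, _ in kept) / len(kept))
    print("posterior mean of sigma:", sum(s for _, s in kept) / len(kept))
```

Each Metropolis step costs one pass over the observed path through `log_transition_density`. That is the point at which a pre-trained surrogate pays off: the evaluation replaces a fresh Fokker-Planck solve or a diffusion-bridge simulation for every proposed parameter.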