MCMC Informed Neural Emulators for Uncertainty Quantification in Dynamical Systems

2026-03-11

Machine Learning
AI summary

The authors address the problem of using neural networks to mimic physical models when no accurate prior distribution for the model parameters is available. Instead of adjusting the network itself to handle uncertainty, they feed the distribution of plausible model parameters directly into the training, sampling it with Markov chain Monte Carlo (MCMC). The resulting surrogate runs much faster than the physical model while reproducing its uncertainty quantification. They show this works with different types of networks and also analyze how a mismatch between the assumed and the true parameter distribution can affect performance.

neural networks, uncertainty quantification, Markov chain Monte Carlo (MCMC), surrogate models, autoencoder, ordinary differential equations (ODE), parameter distribution, quantile emulator
Authors
Heikki Haario, Zhi-Song Liu, Martin Simon, Hendrik Weichel
Abstract
Neural networks are a commonly used approach to replace physical models with computationally cheap surrogates. Parametric uncertainty quantification can be included in training, assuming that an accurate prior distribution of the model parameters is available. Here we study the common opposite situation, where direct screening or random sampling of model parameters leads to exhaustive training times and evaluations at unphysical parameter values. Our solution is to decouple uncertainty quantification from network architecture. Instead of sampling network weights, we introduce the model-parameter distribution as an input to network training via Markov chain Monte Carlo (MCMC). In this way, the surrogate achieves the same uncertainty quantification as the underlying physical model, but with substantially reduced computation time. The approach is fully agnostic with respect to the neural network choice. In our examples, we present a quantile emulator for prediction and a novel autoencoder-based ODE network emulator that can flexibly estimate different trajectory paths corresponding to different ODE model parameters. Moreover, we present a mathematical analysis that provides a transparent way to relate potential performance loss to measurable distribution mismatch.
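
The idea of using MCMC samples of the model parameters as training inputs can be illustrated with a minimal sketch. It is not the authors' implementation. The toy decay model, the synthetic data, the random-walk Metropolis sampler, and all names (`model`, `log_post`, `metropolis`, `emulator`) are assumptions made for illustration. To stay dependency-free, a cubic polynomial least-squares fit stands in for the neural network, which the abstract describes as interchangeable.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(k, t):
    """Toy 'physical model' (assumed): solution of dy/dt = -k*y, y(0) = 1."""
    return np.exp(-np.outer(np.atleast_1d(k), t))

# Hypothetical noisy observations generated with k = 0.7.
t_obs = np.linspace(0.0, 5.0, 20)
sigma = 0.05
y_obs = model(0.7, t_obs)[0] + sigma * rng.normal(size=t_obs.size)

def log_post(k):
    """Log-posterior with a flat prior on k > 0 and Gaussian noise."""
    if k <= 0:
        return -np.inf
    r = y_obs - model(k, t_obs)[0]
    return -0.5 * np.sum(r**2) / sigma**2

def metropolis(n_steps, k0=1.0, step=0.05):
    """Random-walk Metropolis sampler for the scalar parameter k."""
    chain = np.empty(n_steps)
    k, lp = k0, log_post(k0)
    for i in range(n_steps):
        k_new = k + step * rng.normal()
        lp_new = log_post(k_new)
        if np.log(rng.uniform()) < lp_new - lp:  # accept or reject
            k, lp = k_new, lp_new
        chain[i] = k
    return chain

chain = metropolis(5000)[1000:]  # drop burn-in

# Training set: run the expensive model only at MCMC-sampled parameters.
k_train = chain[::20]
mu, sd = k_train.mean(), k_train.std()
t_pred = np.linspace(0.0, 5.0, 50)

def features(k, degree=3):
    """Polynomial features of the standardized parameter."""
    z = (np.atleast_1d(k) - mu) / sd
    return np.vander(z, degree + 1, increasing=True)

# Stand-in for network training: linear least squares on the features.
W, *_ = np.linalg.lstsq(features(k_train), model(k_train, t_pred), rcond=None)

def emulator(k):
    """Cheap surrogate mapping parameter values to trajectories."""
    return features(k) @ W

# Push the full chain through the surrogate to get predictive quantiles.
levels = [0.025, 0.5, 0.975]
q_emul = np.quantile(emulator(chain), levels, axis=0)
q_true = np.quantile(model(chain, t_pred), levels, axis=0)
max_err = np.max(np.abs(q_emul - q_true))
print("max quantile error of surrogate:", max_err)
```

Because the training inputs come from the posterior chain rather than from a wide screening grid, the surrogate is fitted only where the parameters are plausible. In this toy setting, that is why a low-order fit is enough to reproduce the predictive quantiles of the model.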