ForwardFlow: Simulation-only statistical inference using deep learning
2026-03-11 • Machine Learning
Machine Learning • Neural and Evolutionary Computing
AI summary
The authors explore a new deep learning method to estimate parameters in frequentist statistical models using simulations. They design a special neural network that summarizes data and predicts parameters by minimizing prediction errors. Their model shows three benefits: accurate estimates even with small samples, resistance to noisy data, and the ability to approximate complex algorithms automatically. They demonstrate this by teaching the network to mimic an EM-algorithm for genetic data. The approach allows researchers to focus on simulating data while the neural network handles the harder task of estimating parameters.
deep learning, frequentist models, parameter estimation, summary statistics, normalizing flows, inverse problem, mean-square error, EM-algorithm, genetic data, simulation-only frameworks
Authors
Stefan Böhringer
Abstract
Deep learning models are being used for the analysis of parametric statistical models based on simulation-only frameworks. Bayesian models using normalizing flows simulate data from a prior distribution and are composed of two deep neural networks: a summary network that learns a sufficient statistic for the parameter, and a normalizing flow that, conditional on the summary network, can approximate the posterior distribution. Here, we explore frequentist models that are based on a single summary network. During training, the input to the network is a data set simulated under a given parameter value, and the loss function minimizes the mean-square error between the learned summary and the parameter. The network thereby solves the inverse problem of parameter estimation. We propose a branched network structure containing collapsing layers that reduce a data set to summary statistics, which are further mapped through fully connected layers to approximate the parameter estimate. We motivate our choice of network structure by theoretical considerations. In simulations, we demonstrate three desirable properties of the parameter estimates: finite-sample exactness, robustness to data contamination, and algorithm approximation. These properties are achieved by offering the network varying sample sizes, contaminated data, and data requiring algorithmic reconstruction during the training phase. In our simulations, an EM-algorithm for genetic data is automatically approximated by the network. Simulation-only approaches appear to offer practical advantages in complex modeling tasks: the simpler data-simulation part is left to the researcher, while the more complex inverse problem is left to the neural network. Challenging future work includes offering pre-trained models that can be used in a wide variety of applications.
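The simulation-only estimation loop described in the abstract can be sketched in a few lines. The toy below is not the authors' ForwardFlow implementation: it replaces the deep summary network with hand-coded permutation-invariant summaries and the fully connected layers with a least-squares linear head, but it keeps the same structure — simulate (parameter, data set) pairs with varying sample sizes, collapse each data set to summary statistics, and fit the inverse map by minimizing squared error between prediction and parameter. All function names (`simulate`, `collapse`, `estimate`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n):
    """Simulate a data set of size n from N(theta, 1) -- the forward model."""
    return rng.normal(theta, 1.0, size=n)

def collapse(x):
    """Toy 'collapsing layer': permutation-invariant reduction of a data set
    to fixed-length summary statistics (mean, variance, scaled sample size)."""
    return np.array([x.mean(), x.var(), x.size / 100.0])

# Training data: (summary, parameter) pairs with varying sample sizes,
# mimicking how the network is offered varying n during training.
thetas = rng.uniform(-2.0, 2.0, size=2000)
summaries = np.stack(
    [collapse(simulate(t, int(rng.integers(20, 100)))) for t in thetas]
)
design = np.column_stack([summaries, np.ones(len(summaries))])  # bias term

# Linear head fit by least squares -- a stand-in for fully connected layers
# trained by SGD on the mean-square error between summary and parameter.
weights, *_ = np.linalg.lstsq(design, thetas, rcond=None)

def estimate(x):
    """Map a raw data set to a parameter estimate: collapse, then linear head."""
    features = np.append(collapse(x), 1.0)
    return float(features @ weights)
```

In this toy problem the fitted head essentially recovers the sample mean as estimator, which is the point of the loss: minimizing mean-square error against the true parameter drives the composite map toward a good estimator without the researcher ever writing one down.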