When, why, and how do diffusion posterior samplers fail? A finite-sample lens

2026-05-28 · Machine Learning

AI summary

The authors study how diffusion models are used for posterior sampling, that is, drawing plausible images consistent with a set of measurements, and identify problems with current methods that approximate the likelihood at intermediate sampling steps. They show that these approximations can make the intermediate posterior either too narrow or too wide, leading to errors such as hallucinated image features, incorrect relative weighting of posterior modes, and sensitivity to when sampling stops. Notably, these errors do not require a nonlinear measurement model or a multimodal posterior; a multimodal prior combined with inaccurate posterior spread is enough. Their finite-sample approach offers a way to diagnose such errors regardless of the forward model or the likelihood approximation used.

Diffusion models, Posterior sampling, Inverse problems, Likelihood approximation, Finite-sample analysis, Multimodal distributions, Sampling error, Forward model, Early stopping, Bayesian inference
Authors
Benjamin A. Burns, Sara Fridovich-Keil
Abstract
Diffusion models have excellent capacity to model complex distributions of natural data, which has made them a popular and effective choice for posterior sampling in imaging inverse problems. Existing methods can incorporate any measurement model at inference time but must use an inexact approximation for the likelihood at intermediate timesteps for computational tractability. Although these approximations can often work well empirically, their downstream effect on the sampled posterior is poorly understood and can result in unexplained failures. To understand when, why, and how these likelihood approximations propagate to erroneous posterior distributions, we introduce a finite-sample perspective on posterior sampling that approximates the posterior to arbitrary precision as training set size tends towards infinity, for any forward model and prior distribution. Using this finite-sample lens, we observe that popular posterior sampling approximations tend to under- or over-estimate the spread of the posterior at intermediate timesteps, causing downstream consequences including sensitivity to early stopping time, inaccurate relative weighting of posterior modes, and hallucination, both of prior modes that are not in the posterior and likelihood modes that are not supported by the prior. Moreover, we find that the cause of these posterior errors requires neither a nonlinear measurement model nor a multimodal posterior, but can arise solely due to a multimodal prior and inaccurate posterior spread at intermediate sampling times. Our finite-sample posterior sampling approach is agnostic to the type of likelihood approximation and the type of (linear or nonlinear) forward model, and can thus serve as a drop-in diagnostic to evaluate the accuracy and failure modes of existing and future posterior samplers.
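The abstract does not spell out how the finite-sample posterior is constructed, so the following is only a minimal sketch of one natural reading and not the authors' implementation. It assumes the prior is replaced by the empirical distribution over a training set, the measurement noise is Gaussian with known standard deviation, and the forward diffusion has the form x_t = alpha_t * x_0 + sigma_t * noise. Under those assumptions the posterior is a reweighting of the training points by their likelihood, and the noisy posterior at time t is a Gaussian mixture with the same weights. The names `posterior_weights`, `sample_noisy_posterior`, `alpha_t`, and `sigma_t` are hypothetical and chosen for illustration.

```python
import numpy as np


def posterior_weights(train_x, y, forward, noise_std):
    """Return w_i proportional to p(y | x_i), assuming Gaussian measurement noise."""
    residuals = np.stack([np.ravel(y - forward(x)) for x in train_x])
    log_lik = -0.5 * np.sum(residuals**2, axis=1) / noise_std**2
    log_lik -= log_lik.max()  # stabilise the exponent before normalising
    w = np.exp(log_lik)
    return w / w.sum()


def sample_noisy_posterior(train_x, weights, alpha_t, sigma_t, n_samples, rng):
    """Draw x_t from the mixture sum_i w_i N(alpha_t * x_i, sigma_t^2 I)."""
    idx = rng.choice(len(train_x), size=n_samples, p=weights)
    noise = rng.standard_normal((n_samples,) + train_x.shape[1:])
    return alpha_t * train_x[idx] + sigma_t * noise


# Toy setup (assumed for illustration): a bimodal 1-D prior and an identity
# forward model, so the measurement model is linear but the prior is multimodal.
rng = np.random.default_rng(0)
train_x = np.concatenate(
    [rng.normal(-2.0, 0.3, (50, 1)), rng.normal(2.0, 0.3, (50, 1))]
)
y = np.array([1.0])
w = posterior_weights(train_x, y, forward=lambda x: x, noise_std=1.0)
x_t = sample_noisy_posterior(
    train_x, w, alpha_t=0.8, sigma_t=0.6, n_samples=1000, rng=rng
)
```

In this reading, samples such as `x_t` would serve as a reference at a given timestep: comparing their spread and mode weights against the intermediate samples of an approximate posterior sampler is one way to check whether the approximation is too narrow or too wide.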