Deterministic Mode Proposals: An Efficient Alternative to Generative Sampling for Ambiguous Segmentation
2026-03-20 • Computer Vision and Pattern Recognition
AI summary
The authors address segmentation tasks in which several different segmentations of the same image are equally correct, as in medical imaging. Instead of drawing many random samples from a generative model and clustering them afterwards, which is slow, they propose a deterministic method that outputs a fixed-size set of likely segmentation masks in a single forward pass. To deal with superfluous proposals, they adapt a confidence mechanism from object detection. Their method reduces inference time, achieves higher ground-truth coverage than existing generative models, and can be trained without knowing the full distribution of outcomes. They also show how to estimate prior mode probabilities for the proposals by decomposing the velocity field of a pre-trained flow model.
Keywords: image segmentation, uncertainty, generative models, mode proposal models, confidence mechanism, object detection, inference time, velocity field, flow model, ground-truth coverage
Authors
Sebastian Gerard, Josephine Sullivan
Abstract
Many segmentation tasks, such as medical image segmentation or future state prediction, are inherently ambiguous, meaning that multiple predictions are equally correct. Current methods typically rely on generative models to capture this uncertainty. However, identifying the underlying modes of the distribution with these methods is computationally expensive, requiring large numbers of samples and post-hoc clustering. In this paper, we shift the focus from stochastic sampling to the direct generation of likely outcomes. We introduce mode proposal models, a deterministic framework that efficiently produces a fixed-size set of proposal masks in a single forward pass. To handle superfluous proposals, we adapt a confidence mechanism, traditionally used in object detection, to the high-dimensional space of segmentation masks. Our approach significantly reduces inference time while achieving higher ground-truth coverage than existing generative models. Furthermore, we demonstrate that our model can be trained without knowing the full distribution of outcomes, making it applicable to real-world datasets. Finally, we show that by decomposing the velocity field of a pre-trained flow model, we can efficiently estimate prior mode probabilities for our proposals.
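To make the inference-time idea concrete, here is a minimal sketch of how a fixed-size set of proposal masks might be filtered by confidence and then scored for ground-truth coverage. It is not the authors' implementation: the function names (`iou`, `filter_proposals`, `coverage`), the thresholds `min_conf` and `min_iou`, and the IoU-based definition of coverage are all assumptions made for illustration, and the abstract does not specify how the paper defines or computes these quantities.

```python
from typing import List, Sequence

# A binary mask represented as rows of 0/1 values (illustrative format).
Mask = Sequence[Sequence[int]]


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    # Two empty masks are treated as identical.
    return 1.0 if union == 0 else inter / union


def filter_proposals(
    proposals: List[Mask], confidences: List[float], min_conf: float = 0.5
) -> List[Mask]:
    """Drop proposals whose confidence is below min_conf (assumed threshold)."""
    return [m for m, c in zip(proposals, confidences) if c >= min_conf]


def coverage(
    kept: List[Mask], ground_truths: List[Mask], min_iou: float = 0.5
) -> float:
    """Fraction of ground-truth masks matched by some kept proposal.

    A ground truth counts as covered if at least one proposal reaches
    min_iou with it. This definition is an assumption for this sketch.
    """
    if not ground_truths:
        return 0.0
    hits = sum(
        1 for gt in ground_truths if any(iou(p, gt) >= min_iou for p in kept)
    )
    return hits / len(ground_truths)


# Hypothetical output of one forward pass: K = 3 proposals with confidences.
proposals = [
    [[1, 1], [0, 0]],  # top row
    [[0, 0], [1, 1]],  # bottom row
    [[1, 0], [1, 0]],  # left column, low confidence (superfluous)
]
confidences = [0.9, 0.8, 0.1]

# Two equally valid annotations of the same image.
ground_truths = [
    [[1, 1], [0, 0]],
    [[0, 0], [1, 1]],
]

kept = filter_proposals(proposals, confidences)
score = coverage(kept, ground_truths)
```

In this toy example the low-confidence third proposal is discarded, and the two remaining proposals each match one annotation, so every ground-truth mode is covered without any sampling or clustering step.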