MixFlow: Mixed Source Distributions Improve Rectified Flows
2026-04-10 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition • Machine Learning
AI summary
The authors address slow sampling in diffusion models and related methods such as rectified flows, which generate images iteratively along highly curved generative paths. They note that one cause of this curvature is that the fixed source distribution (a standard Gaussian) is independent of the data distribution. To address this, they introduce κ-FC, a formulation that conditions the source distribution on a signal κ so that it aligns better with the data, and a training method called MixFlow. MixFlow trains the flow model on mixtures of the fixed distribution and the κ-FC-based distribution, which reduces path curvature and yields better image quality with fewer sampling steps.
Diffusion models • Rectified flows • Generative paths • Source distribution • Standard Gaussian • κ-FC • MixFlow • Flow model • FID (Fréchet Inception Distance) • Sampling efficiency
Authors
Nazir Nayal, Christopher Wewer, Jan Eric Lenssen
Abstract
Diffusion models and their variants, such as rectified flows, generate diverse and high-quality images, but they are still hindered by slow iterative sampling caused by the highly curved generative paths they learn. An important cause of high curvature, as shown by previous work, is the independence between the source distribution (a standard Gaussian) and the data distribution. In this work, we tackle this limitation with two complementary contributions. First, we break away from the standard Gaussian assumption by introducing $κ\texttt{-FC}$, a general formulation that conditions the source distribution on an arbitrary signal $κ$ so that it aligns better with the data distribution. Second, we present MixFlow, a simple but effective training strategy that reduces the curvature of the generative paths and considerably improves sampling efficiency. MixFlow trains a flow model on linear mixtures of a fixed unconditional distribution and a $κ\texttt{-FC}$-based distribution. This simple mixture improves the alignment between the source and the data, provides better generation quality with fewer sampling steps, and considerably accelerates training convergence. On average, our training procedure improves generation quality by 12\% in FID compared to standard rectified flow and by 7\% compared to previous baselines under a fixed sampling budget. Code is available at: $\href{https://github.com/NazirNayal8/MixFlow}{https://github.com/NazirNayal8/MixFlow}$
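
To make the idea of a mixed source more concrete, here is a minimal sketch of how one rectified-flow training pair could be built from such a source. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the mixture is formed or what $κ$ is, so the sketch reads "linear mixture" as a per-sample convex combination with a hypothetical weight `lam`, and stands in for the $κ\texttt{-FC}$-based distribution with a hypothetical Gaussian whose mean and standard deviation (`cond_mean`, `cond_std`) are given. The interpolation $x_t = (1 - t)\,x_0 + t\,x_1$ and the velocity target $x_1 - x_0$ are the standard rectified flow construction.

```python
import random


def sample_mixed_source(cond_mean, cond_std, lam, rng):
    """Draw one source sample as a convex mix of two Gaussians.

    ASSUMPTION: `cond_mean` / `cond_std` are hypothetical stand-ins for a
    conditioned source distribution; `lam` in [0, 1] is a hypothetical
    mixing weight. None of these names come from the paper.
    """
    # Fixed unconditional source: standard Gaussian noise.
    eps = [rng.gauss(0.0, 1.0) for _ in cond_mean]
    # Conditioned source (assumed Gaussian here for illustration).
    z = [rng.gauss(m, s) for m, s in zip(cond_mean, cond_std)]
    # lam = 0 recovers the standard Gaussian source exactly.
    return [(1.0 - lam) * e + lam * zi for e, zi in zip(eps, z)]


def rectified_flow_pair(x0, x1, t):
    """Return the point x_t on the straight path and the target x1 - x0."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    v = [b - a for a, b in zip(x0, x1)]
    return x_t, v


rng = random.Random(0)
x1 = [0.5, -1.0, 2.0]  # toy "data" point
# Toy choice: centre the conditioned source near the data point.
x0 = sample_mixed_source(cond_mean=x1, cond_std=[0.1] * 3, lam=0.5, rng=rng)
t = rng.random()       # time drawn uniformly from [0, 1)
x_t, v = rectified_flow_pair(x0, x1, t)
```

In a training loop, a model would be regressed onto `v` given `x_t` and `t`. Intuitively, the closer the source sample starts to the data point, the shorter the straight segment to be learned, which is consistent with the abstract's claim that a better-aligned source reduces path curvature.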