Mean Flow Distillation: Robust and Stable Distillation for Flow Matching Models
2026-06-09 • Computer Vision and Pattern Recognition
AI summary
The authors address the problem that flow matching models, which generate data by simulating continuous changes, are slow because they need many steps to create samples. They propose Mean Flow Distillation (MFD), a new way to simplify these models by averaging flow velocities, which reduces noise and keeps the overall structure intact. Their theory and experiments show that MFD can produce high-quality results much faster, including in tasks like predicting 4D shapes and generating images from text. This makes flow matching models more practical for real-time use.
Flow Matching, ODE Sampling, Distillation, Variational Score Distillation, Temporal Low-Pass Filter, Trajectory Consistency, High-Dimensional Manifolds, 4D Occupancy Forecasting, Text-to-Image Generation, Distribution Alignment
Authors
An Zhao, Shengyuan Zhang, Zhongjian Sun, Yixiang Zhou, Zejian Li, Ling Yang, Tianrun Chen, Lingyun Sun
Abstract
Flow Matching models have demonstrated strong performance across a wide range of generative tasks. However, their reliance on ODE-based iterative sampling incurs substantial computational overhead at inference time, which limits their applicability in real-time settings. While distillation is a promising solution, existing approaches largely borrow from diffusion-based score matching; they often fail to exploit the intrinsic geometric structure of flows and suffer from training instability, high variance, and degraded generation quality. In this paper, we propose Mean Flow Distillation (MFD), a novel distillation framework tailored for flow matching models. We theoretically demonstrate that MFD acts as a temporal low-pass filter, effectively suppressing the high-frequency optimization noise inherent in variational score distillation (VSD) while ensuring global trajectory consistency. We further prove the Mean Flow Matching Theorem, establishing that matching expected average velocities is sufficient for strict distribution alignment. Empirically, on challenging tasks on high-dimensional manifolds, including 4D occupancy forecasting and text-to-image generation, MFD achieves state-of-the-art performance, enabling high-fidelity single-step generation.
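To make the contrast between iterative ODE sampling and single-step generation concrete, the following is a minimal sketch on a toy problem, not the MFD method itself. It assumes a hypothetical one-dimensional linear velocity field v(x, t) = A·x, whose trajectory is known in closed form, and shows that stepping once with the velocity averaged over the time interval [0, 1] lands on the same endpoint that Euler integration only approaches with many steps. The names `A`, `velocity`, `euler_sample`, `average_velocity`, and `one_step_sample` are illustrative and do not come from the paper.

```python
import math

A = -0.5  # hypothetical coefficient of a toy linear velocity field (assumption)


def velocity(x, t):
    """Toy instantaneous velocity v(x, t) = A * x; not the paper's model."""
    return A * x


def euler_sample(x0, n_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with n_steps Euler steps."""
    x = x0
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity(x, k * dt)
    return x


def average_velocity(x0):
    """Average velocity over [0, 1] along the exact path x(t) = x0 * exp(A * t).

    By definition this is (x(1) - x(0)) / (1 - 0).
    """
    return x0 * (math.exp(A) - 1.0)


def one_step_sample(x0):
    """Single step of size 1 using the average velocity instead of v(x0, 0)."""
    return x0 + average_velocity(x0)


x0 = 2.0
exact = x0 * math.exp(A)       # closed-form endpoint of the toy ODE
coarse = euler_sample(x0, 1)   # one Euler step with the instantaneous velocity
fine = euler_sample(x0, 1000)  # many Euler steps
single = one_step_sample(x0)   # one step with the average velocity

print(f"exact={exact:.6f} coarse={coarse:.6f} fine={fine:.6f} single={single:.6f}")
```

In this toy setting the average velocity is available analytically; in a distillation setting it would have to be learned by a student network, which is the part the abstract's framework addresses and this sketch does not attempt.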