ELF: Embedded Language Flows
2026-05-11 • Computation and Language
Computation and Language, Artificial Intelligence, Machine Learning
AI summary
The authors study how to generate text with diffusion and flow-based models, which are usually applied to continuous data such as images. They propose Embedded Language Flows (ELF), a method that works in the space of continuous word embeddings and converts to actual words only at the final step. This makes it possible to reuse techniques from image generation, such as classifier-free guidance, and yields better generation quality with fewer sampling steps than other diffusion language models. Their experiments suggest that ELF is a promising way to build continuous diffusion language models.
diffusion models, flow matching, continuous embeddings, language modeling, discrete tokens, classifier-free guidance, generation quality, sampling steps
Authors
Keya Hu, Linlu Qiu, Yiyang Lu, Hanhong Zhao, Tianhong Li, Yoon Kim, Jacob Andreas, Kaiming He
Abstract
Diffusion and flow-based models have become the de facto approaches for generating continuous data, e.g., in domains such as images and videos. Their success has attracted growing interest in applying them to language modeling. Unlike their image-domain counterparts, today's leading diffusion language models (DLMs) primarily operate over discrete tokens. In this paper, we show that continuous DLMs can be made effective with minimal adaptation to the discrete domain. We propose Embedded Language Flows (ELF), a class of diffusion models in continuous embedding space based on continuous-time Flow Matching. Unlike existing DLMs, ELF predominantly stays within the continuous embedding space until the final time step, where it maps to discrete tokens using a shared-weight network. This formulation makes it straightforward to adapt established techniques from image-domain diffusion models, e.g., classifier-free guidance (CFG). Experiments show that ELF substantially outperforms leading discrete and continuous DLMs, achieving better generation quality with fewer sampling steps. These results suggest that ELF offers a promising path toward effective continuous DLMs.
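The sampling idea described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation. It assumes a straight-line flow from noise to a target embedding, an oracle velocity function (`toy_velocity`, hypothetical) in place of a trained network, the mean embedding as a stand-in for the unconditional prediction, and nearest-neighbor lookup in place of the shared-weight network that ELF uses to map to tokens. All names and parameters (`sample_tokens`, `num_steps`, `guidance`) are chosen for illustration only.

```python
import numpy as np


def toy_velocity(x, t, target):
    """Velocity of a straight-line path from x toward a fixed target.

    Stand-in for a trained network (assumption, not ELF's model).
    """
    return (target - x) / (1.0 - t)


def sample_tokens(embeddings, cond_ids, num_steps=8, guidance=1.0, seed=0):
    """Euler-integrate a flow in embedding space, then discretize once."""
    rng = np.random.default_rng(seed)
    cond = embeddings[cond_ids]           # (seq_len, dim) conditional targets
    uncond = embeddings.mean(axis=0)      # (dim,) toy unconditional target
    x = rng.standard_normal(cond.shape)   # start from Gaussian noise at t = 0
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt                        # t stays below 1, so 1 - t > 0
        v_cond = toy_velocity(x, t, cond)
        v_uncond = toy_velocity(x, t, uncond)
        # Classifier-free guidance: extrapolate from unconditional to conditional.
        v = v_uncond + guidance * (v_cond - v_uncond)
        x = x + dt * v                    # the state remains continuous here
    # Only after the last step: map each embedding to its nearest token
    # (assumption: nearest neighbor replaces ELF's shared-weight decoder).
    dist2 = ((x[:, None, :] - embeddings[None, :, :]) ** 2).sum(axis=-1)
    return dist2.argmin(axis=1)


if __name__ == "__main__":
    table = np.random.default_rng(1).standard_normal((10, 16))
    print(sample_tokens(table, [3, 1, 4]))
```

In this toy setting, `guidance=1.0` reduces to the plain conditional flow, while larger values push the final embedding further from the unconditional target before the single discretization step.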