Research Paper (Researchia:202604.15051)

LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling

Yuxin Chen


Submitted: April 15, 2026
Subjects: Machine Learning; Data Science

Description / Details

Continuous diffusion models have achieved strong performance across domains such as images. However, in language modeling, prior continuous diffusion language models (DLMs) lag behind discrete counterparts. In this work, we close this gap with LangFlow, the first continuous DLM to rival discrete diffusion. Our approach connects embedding-space DLMs to Flow Matching via Bregman divergence and introduces three key innovations: (1) a novel ODE-based NLL bound for principled evaluation of continuous flow-based language models; (2) an information-uniform principle for noise scheduling, motivating a learnable scheduler based on a Gumbel distribution; and (3) an improved training protocol incorporating self-conditioning, which enhances both likelihood and sample quality. LangFlow achieves strong performance across benchmarks, reaching a perplexity (PPL) of 30.0 on LM1B and 24.6 on OpenWebText. It matches top discrete DLMs at comparable scale and surpasses autoregressive baselines in zero-shot transfer across multiple benchmarks. LangFlow provides clear evidence that continuous diffusion is a competitive and promising paradigm for language modeling.

Code: https://github.com/nealchen2003/LangFlow
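To make the link between an NLL bound and the reported perplexities concrete, here is a minimal sketch of the standard conversion, perplexity = exp(average NLL per token). This is not code from the LangFlow repository: the function names are hypothetical, and the sketch assumes the reported PPL values are per-token quantities derived from an NLL measured in nats. Because an NLL bound is an upper bound, any perplexity computed from it is also an upper bound.

```python
import math


def nll_to_perplexity(total_nll_nats: float, num_tokens: int) -> float:
    """Perplexity from a summed NLL (in nats) over num_tokens tokens."""
    return math.exp(total_nll_nats / num_tokens)


def perplexity_to_nats_per_token(ppl: float) -> float:
    """Inverse conversion: average NLL per token (in nats) for a given perplexity."""
    return math.log(ppl)


if __name__ == "__main__":
    # Perplexities quoted in the abstract; the per-token reading is an assumption.
    reported = {"LM1B": 30.0, "OpenWebText": 24.6}
    for name, ppl in reported.items():
        nats = perplexity_to_nats_per_token(ppl)
        bits = nats / math.log(2)  # convert nats to bits
        print(f"{name}: PPL {ppl} ~ {nats:.3f} nats/token ~ {bits:.3f} bits/token")
```

Read this way, a lower NLL bound per token translates exponentially into a lower perplexity, which is why small likelihood gains can show up as visible PPL differences.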


Source: arXiv:2604.11748v1 (http://arxiv.org/abs/2604.11748v1)
PDF: https://arxiv.org/pdf/2604.11748v1
