COLM 2026 Workshop

Non-Autoregressive Language Models for Fast & Flexible Text Generation

A workshop on language generation beyond next-token prediction — spanning diffusion, flow matching, and any-order autoregression.

October 9, 2026 · San Francisco, CA · co-located with COLM 2026
  • Submissions due: Jun 23, 2026 (via OpenReview · AoE)
  • Notifications: Jul 24, 2026 (accept / reject)
  • Camera-ready: TBA (non-archival)
  • Workshop: Oct 9, 2026 (San Francisco)
News

Announcements

  • Workshop website is live. Call for papers details are posted below; the OpenReview submission portal will be linked here once it opens.
  • NonAR-LM is confirmed as an official workshop at COLM 2026 in San Francisco.
About

Language generation beyond next-token prediction

Autoregressive next-token prediction has long been the dominant paradigm for language modeling, thanks to its simplicity, scalability, and strong empirical performance. Yet the left-to-right factorization imposes constraints that limit efficiency, controllability, and global coherence.

Recent advances in non-autoregressive modeling offer a fundamentally different approach to discrete sequence generation. Instead of committing to a fixed left-to-right order, these models can decode in parallel, generate tokens in any order, and revise earlier decisions. They span masked and uniform-state diffusion, discrete flow matching, and any-order autoregression, and are now competitive at scale and increasingly deployed in industry systems. This workshop brings the community together around three core challenges of autoregressive generation that these models aim to address:

Sequential decoding bottleneck

Autoregressive models generate tokens one at a time, which prevents parallelism across the sequence and leaves hardware underutilized.

Limited controllability

Conditioning on global constraints or future tokens is indirect, often needing complex prompting, rejection sampling, or constrained decoding.

Limited global consistency

Local, token-level decisions can drift into incoherence over long horizons, since the model cannot revise earlier outputs.

Topics

Call for contributed work

We invite contributions on training and/or inference of non-autoregressive language models — including diffusion, flow matching, and any-order autoregression.

01

Modeling & Training

New model classes and training objectives — discrete diffusion, uniform-state diffusion, flow-based, and any-order approaches.

02

Inference & Sampling

Inference-time algorithms: iterative refinement, parallel decoding, controllable and constrained generation, planning and correction.

03

Evaluation & Efficiency

Evaluation beyond left-to-right likelihood, plus parallel generation, latency-constrained inference, and systems for scaling.

04

Applications

Applications across language, code, and biological sequences — including comparative studies of when iterative models help.

Call for Papers

Submit your work

Submissions may present new results, works in progress, negative results, empirical evaluations, or forward-looking position papers relevant to the workshop themes.

  • Up to 8 pages, excluding references and an optional appendix; shorter submissions are equally welcome.
  • Non-archival. Submitting does not preclude publishing elsewhere.
  • Double-blind review. Each submission receives at least three reviews.
  • Six spotlight (contributed) talks selected from submissions; all accepted work is presented as posters.
  • Submissions due June 23, 2026; notifications by July 24, 2026.
Program

Schedule

A full day of 6 invited talks, 6 contributed spotlight talks, 2 poster sessions, and a panel discussion (times in Pacific Time).

09:00  Opening Remarks
09:10  Invited Talk 1 (Keynote · 40 min)
09:50  Invited Talk 2 (Keynote · 40 min)
10:30  Coffee & Poster Session 1
11:30  Invited Talk 3 (Keynote · 40 min)
12:10  Contributed Talks I (3 spotlight talks · 10 min each)
12:40  Lunch
13:50  Invited Talk 4 (Keynote · 40 min)
14:30  Invited Talk 5 (Keynote · 40 min)
15:10  Contributed Talks II (3 spotlight talks · 10 min each)
15:40  Coffee & Poster Session 2
16:40  Invited Talk 6 (Keynote · 40 min)
17:20  Panel Discussion: The future of language generation beyond next-token prediction (60 min)
18:20  Closing Remarks

Talk-to-slot assignments will be finalized closer to the event.

Invited Speakers

Speakers

All invited speakers and panelists have confirmed.

Stefano Ermon
Associate Professor, Stanford
Shansan Gong
Ph.D. Candidate, University of Hong Kong
Aditya Grover
Assistant Professor, UCLA
Jiaxin Shi
Research Scientist, Meta SuperIntelligence Labs
Arash Vahdat
Research Director, NVIDIA Research
Mengdi Wang
Professor, Princeton University
Panel

The future of language generation

Our panel examines when iterative discrete generation offers qualitatively different capabilities from standard autoregressive modeling, and what technical barriers remain in scaling, controllability, inference-time computation, and evaluation — bringing together complementary viewpoints from academia and industry.

Fred Peng
Duke University
Moderator
Aditya Grover
UCLA
Panelist
Subham Sekhar Sahoo
MBZUAI – IFM
Panelist
Aaron Lou
OpenAI
Panelist
Organizers

Organizing committee

Contact: Fred Peng (zhangzhi.peng@duke.edu).

Junior Organizers

Fred Peng
Ph.D. Candidate, Duke University
Marianne Arriola
Ph.D. Candidate, Cornell University
Jaeyeon Kim
Ph.D. Student, Harvard University
Siyan Zhao
Ph.D. Candidate, UCLA

Senior Organizers

Alexander Tong
Principal Investigator, Aithyra
Arnaud Doucet
Senior Staff Research Scientist, Google DeepMind · Visiting Professor, Univ. of Oxford
Brendan O’Donoghue
Director of Research, Google DeepMind
Yuxuan Song
Research Scientist, ByteDance Seed

Inclusion

Our speakers, panelists, and organizers span career stages, institution types, and geographies across North America, Europe, the Middle East, and Asia, bridging academia and industry.

We support broad participation through an open, non-archival call, poster presentations for all accepted papers, and spotlight talks for selected submissions.