BabyLM Turns 4: Call for Papers for the 2026 BabyLM Workshop

2026-02-23
Computation and Language

Keywords
BabyLM, cognitive modeling, language modeling, pretraining, data efficiency, multilingual models, model evaluation, training efficiency
Authors
Leshem Choshen, Ryan Cotterell, Mustafa Omer Gul, Jaap Jumelet, Tal Linzen, Aaron Mueller, Suchir Salhan, Raj Sanjay Shah, Alex Warstadt, Ethan Gotlieb Wilcox
Abstract
BabyLM aims to dissolve the boundaries between cognitive modeling and language modeling. We call both for workshop papers and for researchers to join the 4th BabyLM competition. As in previous years, the general track invites participants to the data-efficient pretraining challenge. This year, we also offer a new track: Multilingual. Beyond the competition, we welcome papers in any relevant area, including training efficiency, cognitively plausible research, evaluation of weak models, and more.