Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions

2026-05-29

Computation and Language, Artificial Intelligence
AI summary

The authors studied how well different language models understand rare English phrase patterns called Paired-Focus constructions, such as "let alone" and "much less." They built a new test of whether models grasp both the form and the meaning of these phrases, using scalar adjective comparisons and general world knowledge. Several modestly sized open-source models handle these constructions, but models trained on human-scale data fail the meaning evaluations. Semantic understanding of these phrases emerges later in training than syntactic knowledge, and its acquisition correlates with gains in some domains of world knowledge. Overall, the results suggest that modestly sized open-source models can learn these rare constructions.

Paired-Focus constructions, Large language models (LLMs), Semantics, Scalar adjectival semantics, Open-source models, Training dynamics, Form-meaning pairing, World knowledge
Authors
Wesley Scivetti, Ethan Wilcox, Nathan Schneider, Kanishka Misra, Leonie Weissweiler
Abstract
Grasping the semantics of rare constructions (form-meaning pairings) has been shown to be a challenging problem that has so far been solved only by the largest LLMs. It remains an open question whether open-source models have robust constructional understanding and, if so, what learning dynamics underlie the acquisition of this knowledge. Focusing on a set of rare Paired-Focus constructions in English (e.g., "let alone", "much less"), we construct a novel dataset to test their meanings using both scalar adjectival semantics and general world knowledge. Testing a wide range of models differing in parameter count, architecture, and pretraining dataset size, we find that several modestly sized models are sensitive to both the forms and the meanings of Paired-Focus constructions, though models trained on human-scale data fail at all meaning evaluations. Turning to training dynamics for a set of open-checkpoint models, we find that Paired-Focus semantic understanding emerges later in training than Paired-Focus syntactic knowledge, and that learning of Paired-Focus semantics is correlated with gains in some domains of world knowledge. Overall, our empirical results support the conclusion that modestly sized open-source models can grasp the rare Paired-Focus constructions, and demonstrate a connection between knowledge of Paired-Focus constructions and other meaning domains.
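To make the idea of testing construction meaning with scalar adjectives concrete, the following is a minimal sketch of a minimal-pair evaluation. It is not the authors' dataset or scoring procedure. The adjective scales, the sentence template, and the names `make_pair` and `pair_accuracy` are all assumptions introduced for illustration. The sketch relies on the observation that, under negation, "let alone" is felicitous when the second focus is the stronger term ("It isn't warm, let alone hot.") and odd when the order is reversed.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical scales, each ordered (weaker, stronger). Not from the paper.
SCALES: List[Tuple[str, str]] = [
    ("warm", "hot"),
    ("good", "excellent"),
    ("big", "enormous"),
]


def make_pair(weaker: str, stronger: str,
              connective: str = "let alone") -> Tuple[str, str]:
    """Build a (felicitous, infelicitous) sentence pair for one scale.

    The template is an illustrative assumption: negating the weaker
    term first and the stronger term second respects the scale, while
    the reversed order violates it.
    """
    template = "It isn't {first}, {conn} {second}."
    felicitous = template.format(first=weaker, conn=connective, second=stronger)
    infelicitous = template.format(first=stronger, conn=connective, second=weaker)
    return felicitous, infelicitous


def pair_accuracy(pairs: Iterable[Tuple[str, str]],
                  score: Callable[[str], float]) -> float:
    """Fraction of pairs where the felicitous sentence scores higher.

    `score` is any sentence-level scorer supplied by the caller, for
    example a function returning a language model's summed token
    log-probability. No particular model or library is assumed here.
    """
    pair_list = list(pairs)
    if not pair_list:
        raise ValueError("pairs must not be empty")
    correct = sum(1 for good, bad in pair_list if score(good) > score(bad))
    return correct / len(pair_list)


# Build one pair per scale for each connective named in the abstract.
PAIRS = [make_pair(w, s, conn)
         for conn in ("let alone", "much less")
         for w, s in SCALES]
```

In practice, `score` would wrap a real language model; a model that is sensitive only to the form of the construction would be expected to score near 0.5 on such pairs, since both sentences in a pair are syntactically identical.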