Towards Linguistically-informed Representations for English as a Second or Foreign Language: Review, Construction and Application
2026-04-10 • Computation and Language
Computation and Language, Artificial Intelligence
AI summary
The authors explain that English used as a second or foreign language (ESFL) is its own distinct system, not just imperfect standard English. They review current resources for studying ESFL and find them lacking, so they create a new detailed database that links sentence structure with meaning for ESFL and standard English. This database includes over 1600 annotated ESFL sentences and is based on theories that treat language as learned constructions. They also show how their work can help study how people learn languages by testing a specific linguistic idea.
English as a Second Language (ESL), syntax-semantics interface, constructivist linguistics, annotated corpus, Syntactico-semantic resource, Linguistic Niche Hypothesis, Second Language Acquisition, constructions, language corpus, gold-standard dataset
Authors
Wenxi Li, Xihao Wang, Weiwei Sun
Abstract
The widespread use of English as a Second or Foreign Language (ESFL) has sparked a paradigm shift: ESFL is no longer seen merely as a deviation from standard English but as a distinct linguistic system in its own right. This shift highlights the need for dedicated, knowledge-intensive representations of ESFL. In response, this paper surveys existing ESFL resources, identifies their limitations, and proposes a novel solution. Grounded in constructivist theories, the paper treats constructions as the fundamental units of analysis, allowing it to model the syntax-semantics interface of both ESFL and standard English. This design captures a wide range of ESFL phenomena by referring to the syntactico-semantic mappings of English while preserving ESFL's unique characteristics, resulting in a gold-standard syntactico-semantic resource comprising 1643 annotated ESFL sentences. To demonstrate the sembank's practical utility, we conduct a pilot study testing the Linguistic Niche Hypothesis, highlighting the sembank's potential as a valuable tool in Second Language Acquisition research.