A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models
2026-06-02 • Computation and Language
Computation and Language, Artificial Intelligence
AI summary
The authors studied what language features can help tell if a text was written by AI or a human. They tested 284 different features on texts from 27 AI models across various topics to see which features work reliably. They found that using just these language features, it's possible to identify AI-generated text, but many features depend a lot on the model or topic. The one type of feature that worked well everywhere was "lexical richness," which relates to the variety of words used. This work helps show which language clues are most useful for spotting AI writing in different situations.
linguistic features, lexical richness, large language models (LLMs), AI-generated text, cross-domain generalization, text classification, interpretability, machine-generated text, empirical study, natural language processing
Authors
Yassir El Attar, Esra Dönmez, Maximilian Maurer, Agnieszka Falenska
Abstract
Interpretable linguistic features offer a promising approach for explaining why a given text appears machine-generated, particularly for non-expert users. However, existing findings on which features reliably indicate LLM-generated text remain fragmented across feature sets, models, and text domains. To address this gap, we conduct a large-scale empirical study assessing the robustness of linguistic signals for characterizing AI-generated text. Our analysis covers 284 interpretable linguistic features across outputs from 27 LLMs and ten text domains under cross-model and cross-domain generalization settings. We show that classifiers based solely on linguistic features can reliably distinguish AI-generated from human-written text. However, many previously proposed indicators prove strongly context-dependent, with the exception of measures of lexical richness, which remain robust signals across model families and text domains. These results demonstrate which linguistic signals generalize across contexts and provide a foundation for more reliable, interpretable analyses of AI-generated language.
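To make "lexical richness" concrete, here is a minimal sketch of three commonly used measures of vocabulary variety: type-token ratio, root type-token ratio, and the share of words that occur only once. The abstract does not list which of the 284 features belong to this group, so the function name `lexical_richness`, the regex tokenizer, and the choice of measures are illustrative assumptions, not the authors' implementation.

```python
import math
import re
from collections import Counter
from typing import Dict


def lexical_richness(text: str) -> Dict[str, float]:
    """Compute simple lexical richness measures for one text.

    Hypothetical helper for illustration only; the paper's actual
    feature definitions are not given in the abstract.
    """
    # Assumption: a crude lowercase word tokenizer is good enough here.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {"ttr": 0.0, "root_ttr": 0.0, "hapax_ratio": 0.0}

    counts = Counter(tokens)
    n_tokens = len(tokens)   # total words
    n_types = len(counts)    # distinct words
    n_hapax = sum(1 for c in counts.values() if c == 1)  # words seen once

    return {
        # Type-token ratio: distinct words / total words.
        "ttr": n_types / n_tokens,
        # Root TTR (Guiraud's index): less sensitive to text length.
        "root_ttr": n_types / math.sqrt(n_tokens),
        # Share of tokens whose word form occurs exactly once.
        "hapax_ratio": n_hapax / n_tokens,
    }


if __name__ == "__main__":
    print(lexical_richness("the cat sat on the mat"))
```

Values like these could serve as inputs to a feature-based classifier. Plain type-token ratio falls as texts get longer, so comparing texts of different lengths would likely need a length-normalized variant.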