Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling

2026-04-28 • Computation and Language

Computation and Language • Artificial Intelligence • Computers and Society

AI summary

The authors suggest that detecting machine-generated text should focus on common structural weaknesses rather than specific model traits. They found that AI-written text, while good locally, breaks down more when its word order is shuffled, unlike human text. They created Luminol-AIDetect, which uses this effect by measuring changes in a text's predictability after shuffling to tell AI text apart from human writing. Tested on many languages and scenarios, their method works well and is more efficient than earlier techniques.

machine-generated text detection • large language models • perplexity • text coherence • randomized text shuffling • autoregressive models • zero-shot detection • density estimation • false positive rate

Authors

Lucio La Cava, Andrea Tagarelli

Abstract

Machine-generated text (MGT) detection requires identifying structurally invariant signals across generation models, rather than relying on model-specific fingerprints. In this respect, we hypothesize that while large language models excel at local semantic consistency, their autoregressive nature results in a specific kind of structural fragility compared to human writing. We propose Luminol-AIDetect, a novel, zero-shot statistical approach that exposes this fragility through coherence disruption. By applying a simple randomized text-shuffling procedure, we demonstrate that the resulting shift in perplexity serves as a principled, model-agnostic discriminant, as MGT displays a characteristic dispersion in perplexity-under-shuffling that differs markedly from the more stable structural variability of human-written text. Luminol-AIDetect leverages this distinction to inform its decision process: a handful of perplexity-based scalar features are extracted from an input text and its shuffled version, and detection is then performed via density estimation and ensemble-based prediction. Evaluated across 8 content domains, 11 adversarial attack types, and 18 languages, Luminol-AIDetect demonstrates state-of-the-art performance, achieving up to 17x lower false positive rate (FPR) while being cheaper than prior methods.
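To make the perplexity-under-shuffling idea concrete, the following is a minimal sketch of the feature-extraction step only, not the authors' implementation. Several things here are assumptions made for illustration: the scorer is a toy add-one-smoothed bigram model (`BigramLM`) standing in for a real language model, shuffling is done at the whitespace-token level, and the feature names (`ppl_original`, `ppl_shuffled`, `delta`, `log_ratio`) are hypothetical. The density-estimation and ensemble stages mentioned in the abstract are not sketched.

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Whitespace tokenization (an assumption; a real scorer would use its own tokenizer)."""
    return text.lower().split()


class BigramLM:
    """Toy add-one-smoothed bigram model. A stand-in for an LLM-based perplexity scorer."""

    def __init__(self, corpus_tokens):
        seq = ["<s>"] + list(corpus_tokens)
        self.bigrams = Counter(zip(seq[:-1], seq[1:]))
        self.contexts = Counter(seq[:-1])
        # +1 reserves probability mass for unseen tokens.
        self.vocab_size = len(set(seq)) + 1

    def log_prob(self, prev, word):
        num = self.bigrams[(prev, word)] + 1
        den = self.contexts[prev] + self.vocab_size
        return math.log(num / den)

    def perplexity(self, tokens):
        if not tokens:
            raise ValueError("cannot score an empty token sequence")
        seq = ["<s>"] + list(tokens)
        total = sum(self.log_prob(p, w) for p, w in zip(seq[:-1], seq[1:]))
        return math.exp(-total / len(tokens))


def shuffle_tokens(tokens, seed=0):
    """Return a randomly permuted copy of the tokens (seeded for reproducibility)."""
    rng = random.Random(seed)
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    return shuffled


def shuffle_features(text, lm, seed=0):
    """Scalar features from a text and its shuffled version (hypothetical feature set)."""
    tokens = tokenize(text)
    ppl_original = lm.perplexity(tokens)
    ppl_shuffled = lm.perplexity(shuffle_tokens(tokens, seed))
    return {
        "ppl_original": ppl_original,
        "ppl_shuffled": ppl_shuffled,
        "delta": ppl_shuffled - ppl_original,
        "log_ratio": math.log(ppl_shuffled / ppl_original),
    }


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over a lazy dog"
    lm = BigramLM(tokenize(sentence))
    for name, value in shuffle_features(sentence, lm, seed=1).items():
        print(f"{name}: {value:.4f}")
```

In a fuller pipeline, features like these would be computed for many texts and passed to a density estimator; repeating the shuffle with several seeds would also give a view of the dispersion the abstract refers to.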
