Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators

2026-05-05

Computation and Language · Artificial Intelligence
AI summary

The authors studied how well AI text detectors work when they see different types of AI-generated text than they were trained on. They trained detectors using a dataset called HC3 PLUS and set one decision threshold to use for all tests, revealing that performance drops a lot when testing on new data types. They improved results by adding linguistic features and using a modern model called DeBERTa, which worked better than older models and was more stable across data shifts. Their findings highlight that careful feature design and evaluation methods are important for creating robust text detectors.

AI-generated text detection, distribution shift, transformer models, decision threshold calibration, DeBERTa, feature augmentation, balanced accuracy, cross-dataset evaluation, linguistic features, robustness
Authors
Mohamed Mady, Johannes Reschke, Björn Schuller
Abstract
AI-generated text is now produced at scale across domains and heterogeneous generation pipelines, making robustness to distribution shift a central requirement for supervised binary detectors. We train transformer-based detectors on HC3 PLUS and calibrate a single decision threshold by maximising balanced accuracy on a held-out validation set; this threshold is then kept fixed for all downstream test distributions, revealing domain- and generator-dependent error asymmetries under shift. We evaluate in-domain on HC3 PLUS, under cross-dataset transfer to the multi-domain, multi-generator M4 benchmark, and on the external AI-Text-Detection-Pile. Although base models achieve near-ceiling in-domain performance (up to 99.5% balanced accuracy), performance under shift is brittle and strongly model-dependent. Feature augmentation via attention-based linguistic feature fusion improves transfer, with our best model (DeBERTa-v3-base+FeatAttn) achieving 85.9% balanced accuracy on M4. Multi-seed experiments confirm high stability. Under the same fixed-threshold protocol, our model outperforms strong zero-shot baselines by up to +7.22 points. Category-level ablations further show that readability and vocabulary features contribute most to robustness under shift. Overall, these results demonstrate that feature augmentation and a modern DeBERTa backbone significantly outperform earlier BERT/RoBERTa models, while the fixed-threshold protocol provides a more realistic and informative assessment of practical detector robustness.
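The fixed-threshold protocol described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the detector emits a continuous "AI-likelihood" score, sweeps candidate thresholds on a validation set to maximise balanced accuracy, and then freezes the chosen threshold for every test distribution. The `calibrate_threshold` helper and the synthetic score distributions are hypothetical.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score

def calibrate_threshold(y_val, scores_val, grid=None):
    """Pick the decision threshold that maximises balanced accuracy
    on a held-out validation set (hypothetical helper, not the paper's code)."""
    if grid is None:
        # candidate thresholds: the observed validation scores
        grid = np.unique(scores_val)
    best_t, best_ba = 0.5, -1.0
    for t in grid:
        preds = (scores_val >= t).astype(int)
        ba = balanced_accuracy_score(y_val, preds)
        if ba > best_ba:
            best_t, best_ba = t, ba
    return best_t, best_ba

# Synthetic validation data: label 1 = AI-generated, higher score = more AI-like.
rng = np.random.default_rng(0)
y_val = np.array([0] * 100 + [1] * 100)
scores_val = np.concatenate([rng.normal(0.3, 0.1, 100),
                             rng.normal(0.7, 0.1, 100)])

t, ba = calibrate_threshold(y_val, scores_val)
# Under the fixed-threshold protocol, t is now applied unchanged to every
# downstream test distribution (in-domain, M4, AI-Text-Detection-Pile),
# which exposes error asymmetries that per-dataset re-tuning would hide.
```

Freezing the threshold after validation is what makes the evaluation realistic: a deployed detector cannot re-calibrate for each unseen domain or generator it encounters.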