Towards Anytime-Valid Statistical Watermarking
2026-02-19 • Machine Learning
Machine Learning • Artificial Intelligence
AI summary
The authors address the challenge of identifying text generated by large language models quickly and accurately. They point out problems with current watermarking methods, which can't stop early without losing reliability. To fix this, they created a new approach called Anchored E-Watermarking that allows for flexible, anytime testing while maintaining error guarantees. Their method also optimizes how sampling is done to detect machine-generated text more efficiently. Tests show that their approach needs fewer tokens to make confident detections compared to existing methods.
Large Language Models • Watermarking • E-values • Hypothesis Testing • Supermartingale • Sampling Distribution • Optional Stopping • Type-I Error • Token Budget
Authors
Baihe Huang, Eric Xu, Kannan Ramchandran, Jiantao Jiao, Michael I. Jordan
Abstract
The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promising solution, existing methods suffer from two critical limitations: the lack of a principled approach for selecting sampling distributions and the reliance on fixed-horizon hypothesis testing, which precludes valid early stopping. In this paper, we bridge this gap by developing the first e-value-based watermarking framework, Anchored E-Watermarking, which unifies optimal sampling with anytime-valid inference. Unlike traditional approaches, where optional stopping invalidates Type-I error guarantees, our framework enables valid anytime inference by constructing a test supermartingale for the detection process. By leveraging an anchor distribution to approximate the target model, we characterize the optimal e-value with respect to the worst-case log-growth rate and derive the optimal expected stopping time. Our theoretical claims are substantiated by simulations and evaluations on established benchmarks, showing that our framework significantly enhances sample efficiency, reducing the average token budget required for detection by 13–15% relative to state-of-the-art baselines.
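To make the anytime-valid mechanism concrete, here is a minimal sketch of a generic e-process detector, assuming per-token watermark scores in [0, 1] whose mean under the human-text null is at most 0.5. The betting factor in `e_value`, the constants `lam` and `null_mean`, and the toy score distributions are illustrative placeholders, not the paper's optimized Anchored E-Watermarking construction.

```python
import numpy as np

ALPHA = 0.01  # target Type-I error level


def e_value(score: float, null_mean: float = 0.5, lam: float = 0.5) -> float:
    """Betting-style e-value for one token score in [0, 1].

    Under the null (human text) we assume E[score] <= null_mean, so
    E[1 + lam * (score - null_mean)] <= 1 and each factor is a valid
    e-value. `lam` and `null_mean` are illustrative choices, not the
    paper's growth-rate-optimal construction.
    """
    return 1.0 + lam * (score - null_mean)


def detect(scores, alpha=ALPHA):
    """Anytime-valid sequential detection via a test supermartingale.

    The running product of e-values is a nonnegative supermartingale
    under the null, so Ville's inequality bounds the probability that
    it ever reaches 1/alpha by alpha; stopping at any data-dependent
    time therefore preserves the Type-I error guarantee.
    """
    e_process = 1.0
    for t, s in enumerate(scores, start=1):
        e_process *= e_value(s)
        if e_process >= 1.0 / alpha:
            return t  # reject the null: flagged as watermarked after t tokens
    return None  # budget exhausted without a detection


# Toy usage: watermarked text yields scores biased above the null mean.
rng = np.random.default_rng(0)
print(detect(rng.beta(5, 2, size=500)))   # biased scores: stops well before 500 tokens
print(detect(rng.uniform(size=500)))      # human-like scores: rarely triggers
```

The design point this sketch isolates is that validity comes from the supermartingale property of the running product, not from fixing the number of tokens in advance; the paper's contribution lies in choosing the per-token e-value, via the anchor distribution, to optimize the worst-case log-growth rate and hence the expected stopping time.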