Robust Hypothesis Testing Using Wasserstein Uncertainty Sets
We develop a novel, computationally efficient, and general framework for robust hypothesis testing. The framework features a new way to construct uncertainty sets under the null and alternative distributions: sets centered around the empirical distributions and defined via the Wasserstein metric, so our approach is data-driven and free of distributional assumptions. We develop a convex safe approximation of the minimax formulation and show that this approximation yields a nearly optimal detector among the family of all possible tests. By exploiting the structure of the least favorable distribution, we also develop a tractable reformulation of the approximation, whose complexity is independent of the dimension of the observation space and can in general be nearly independent of the sample size. A real-data example using human activity data demonstrates the excellent performance of the new robust detector.
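As an illustrative sketch only (not the paper's minimax detector), the Wasserstein uncertainty-set idea can be made concrete in one dimension, where the empirical Wasserstein-1 distance between two equal-size samples reduces to the mean absolute difference of the sorted samples; the radius `theta` below is a hypothetical choice:

```python
# Illustrative sketch, not the paper's algorithm. In 1-D, the W1 distance
# between the empirical distributions of two equal-size samples equals the
# mean absolute difference of the order statistics.

def wasserstein1(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

def in_uncertainty_set(sample, center_sample, theta):
    """Check whether `sample`'s empirical distribution lies within
    W1-radius `theta` of `center_sample`'s empirical distribution."""
    return wasserstein1(sample, center_sample) <= theta

# Example: a nearby sample falls inside a radius-0.1 uncertainty set.
null_data = [0.1, -0.2, 0.05, 0.3]
candidate = [0.15, -0.1, 0.0, 0.25]
print(in_uncertainty_set(candidate, null_data, theta=0.1))
```

The robust test then guards against every distribution in two such sets (one per hypothesis), rather than against the empirical distributions alone.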
Towards Anytime-Valid Statistical Watermarking
Baihe Huang, Eric Xu, Kannan Ramchandran, Jiantao Jiao, Michael I. Jordan
The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promising solution, existing methods suffer from two critical limitations: the lack of a principled approach for selecting sampling distributions, and reliance on fixed-horizon hypothesis testing, which precludes valid early stopping. In this paper, we bridge this gap by developing the first e-value-based watermarking framework, Anchored E-Watermarking, which unifies optimal sampling with anytime-valid inference. Unlike traditional approaches, where optional stopping invalidates Type-I error guarantees, our framework enables valid anytime inference by constructing a test supermartingale for the detection process. By leveraging an anchor distribution to approximate the target model, we characterize the optimal e-value with respect to the worst-case log-growth rate and derive the optimal expected stopping time. Our theoretical claims are substantiated by simulations and evaluations on established benchmarks, showing that our framework significantly enhances sample efficiency, reducing the average token budget required for detection by 13-15% relative to state-of-the-art baselines.
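The anytime-valid mechanism can be sketched generically: multiply per-token e-values into a test supermartingale and reject as soon as the running product reaches 1/α, which by Ville's inequality bounds the Type-I error by α at any stopping time. This is a minimal sketch under assumed i.i.d. e-values with mean at most 1 under the null, not the paper's specific watermarking construction:

```python
# Minimal sketch of anytime-valid detection with e-values. Assumes each
# e-value is nonnegative with expectation <= 1 under the null; this is
# generic e-process machinery, not the paper's watermarking e-value.

def detect(e_values, alpha=0.05):
    """Accumulate e-values into a test supermartingale (running product);
    stop and reject the null the first time the product reaches 1/alpha.
    Ville's inequality caps the Type-I error at alpha for any stop rule."""
    product = 1.0
    for t, e in enumerate(e_values, start=1):
        product *= e
        if product >= 1.0 / alpha:
            return t, product  # rejected at step t
    return None, product       # never rejected within the sequence

# Under the null, e-values hover near 1 and the product rarely grows;
# on watermarked text they exceed 1 on average, so detection stops early.
steps, wealth = detect([1.5, 1.2, 2.0, 1.8, 1.6], alpha=0.05)
```

Early stopping here is what drives the token-budget savings: the test spends only as many tokens as the accumulated evidence requires.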