AITopics | Performance Analysis

Online Learning with an Unknown Fairness Metric

Stephen Gillen, Christopher Jung, Michael Kearns, Aaron Roth

Neural Information Processing Systems, Mar-15-2026, 07:35:17 GMT

We consider the problem of online learning in the linear contextual bandits setting, but in which there are also strong individual fairness constraints governed by an unknown similarity metric. These constraints demand that we select similar actions or individuals with approximately equal probability [?], which may be at odds with optimizing reward, thus modeling settings where profit and social policy are in tension. We assume we learn about an unknown Mahalanobis similarity metric from only weak feedback that identifies fairness violations but does not quantify their extent. This is intended to represent the interventions of a regulator who "knows unfairness when he sees it" but nevertheless cannot enunciate a quantitative fairness metric over individuals. Our main result is an algorithm in the adversarial context setting whose number of fairness violations depends only logarithmically on T, while obtaining an optimal O(√T) regret bound to the best fair policy.
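
As a rough illustration of the kind of feedback the abstract describes, the sketch below simulates a regulator who flags pairs of individuals whose selection probabilities differ by more than their Mahalanobis distance. It is not the paper's algorithm. The matrix `A`, the slack `eps`, and the function names are assumptions introduced here, and the metric is passed in only to simulate the regulator; the learner in the paper does not know it.

```python
import numpy as np

def mahalanobis(x, y, A):
    """Distance d(x, y) = ||A (x - y)||_2 for a matrix A (hypothetical parametrisation)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.linalg.norm(A @ diff))

def fairness_violations(contexts, probs, A, eps=0.0):
    """Return pairs (i, j) with |p_i - p_j| > d(x_i, x_j) + eps.

    Only the identity of each violating pair is returned, not its size,
    mirroring feedback that identifies violations without quantifying them.
    """
    violations = []
    for i in range(len(contexts)):
        for j in range(i + 1, len(contexts)):
            gap = abs(probs[i] - probs[j])
            if gap > mahalanobis(contexts[i], contexts[j], A) + eps:
                violations.append((i, j))
    return violations

# Three individuals in two dimensions; the identity matrix stands in for the unknown metric.
contexts = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.1]]
probs = [0.6, 0.3, 0.1]
flagged = fairness_violations(contexts, probs, np.eye(2))
print(flagged)
```

Individuals 0 and 2 are close under the assumed metric but receive very different probabilities, so that pair is the one flagged.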

algorithm, artificial intelligence, machine learning, (14 more...)

Country: North America > United States (0.68)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Towards Anytime-Valid Statistical Watermarking

Huang, Baihe, Xu, Eric, Ramchandran, Kannan, Jiao, Jiantao, Jordan, Michael I.

arXiv.org Machine Learning, Feb-20-2026

The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promising solution, existing methods suffer from two critical limitations: the lack of a principled approach for selecting sampling distributions and the reliance on fixed-horizon hypothesis testing, which precludes valid early stopping. In this paper, we bridge this gap by developing the first e-value-based watermarking framework, Anchored E-Watermarking, that unifies optimal sampling with anytime-valid inference. Unlike traditional approaches where optional stopping invalidates Type-I error guarantees, our framework enables valid, anytime-inference by constructing a test supermartingale for the detection process. By leveraging an anchor distribution to approximate the target model, we characterize the optimal e-value with respect to the worst-case log-growth rate and derive the optimal expected stopping time. Our theoretical claims are substantiated by simulations and evaluations on established benchmarks, showing that our framework can significantly enhance sample efficiency, reducing the average token budget required for detection by 13-15% relative to state-of-the-art baselines.
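
To make the idea of anytime-valid inference concrete, here is a minimal generic sketch of a sequential test built from e-values. It is not the Anchored E-Watermarking procedure. It assumes each token yields a binary signal (for example, a hypothetical "token matches the watermark key" indicator) that is Bernoulli(`p0`) for human text and Bernoulli(`p1`) for watermarked text; `p0`, `p1`, and the function name are illustrative. Each likelihood ratio has expectation 1 under the null, so the running product is a test martingale, and by Ville's inequality stopping the first time it reaches 1/alpha keeps the Type-I error at most alpha regardless of when the test stops.

```python
def sequential_e_test(observations, p0=0.5, p1=0.7, alpha=0.05):
    """Stop the first time the running product of e-values reaches 1/alpha.

    Returns (stopping_time, wealth); stopping_time is None if the null
    is never rejected on the given observations.
    """
    wealth = 1.0
    for t, x in enumerate(observations, start=1):
        # Likelihood ratio of one Bernoulli observation (expectation 1 under the null).
        e_value = (p1 / p0) if x else ((1 - p1) / (1 - p0))
        wealth *= e_value
        if wealth >= 1 / alpha:
            return t, wealth  # reject the null at time t
    return None, wealth

# A stream in which every token carries the signal: detection stops early.
stop_time, final_wealth = sequential_e_test([1] * 50)
print(stop_time, final_wealth)

# An alternating stream never accumulates enough evidence.
no_stop, _ = sequential_e_test([1, 0] * 50)
print(no_stop)
```

The threshold 1/alpha is the only place the error level enters, which is what allows the detector to stop at a data-dependent time.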

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2602.17608

Country:

Asia > Middle East > Jordan (0.41)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Massachusetts > Middlesex County > Burlington (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

A Bias Metrics

Neural Information Processing Systems, Feb-19-2026, 12:34:32 GMT

Nine different debiasing algorithms (and a baseline) have been evaluated with this dataset using the popular ResNet-18 network [36]. CelebA contains faces of celebrities with several binary task labels and two protected labels (gender and youth). Table 3 shows the prediction results from a biased binary classifier and its bias values using the seven metrics. Without loss of generality, we consider "Sport" the positive class in the binary classifier. Following the DP formula in Appendix A.2, for the "Sport" class, the PPR_female is 45.0% (90/200), and PPR_male is 65.0%
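
The arithmetic in this excerpt can be checked with a short sketch. The positive prediction rate (PPR) is the number of positive predictions in a group divided by the group size. The DP formula from Appendix A.2 is not reproduced in this excerpt, so the absolute difference of the two rates used below is an assumption (one common form of demographic parity), and the function names are illustrative. Only the rate, not the counts, is given for the male group, so it is entered directly.

```python
def positive_prediction_rate(n_predicted_positive, n_group):
    """Fraction of a group that the classifier assigns to the positive class."""
    return n_predicted_positive / n_group

def demographic_parity_gap(ppr_a, ppr_b):
    """Absolute difference of two PPRs (assumed form; the paper's formula may differ)."""
    return abs(ppr_a - ppr_b)

ppr_female = positive_prediction_rate(90, 200)  # counts stated in the text
ppr_male = 0.65                                 # rate stated in the text; counts not shown
dp_gap = demographic_parity_gap(ppr_female, ppr_male)
print(f"PPR_female={ppr_female:.1%}, PPR_male={ppr_male:.1%}, gap={dp_gap:.1%}")
```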

artificial intelligence, dataset, machine learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
