Optimal Algorithms for Augmented Testing of Discrete Distributions
–Neural Information Processing Systems
We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution p, extensive research has established optimal bounds for uniformity testing, identity testing (goodness of fit), and closeness testing (equivalence or two-sample testing). We explore these problems in a setting where a predicted data distribution, possibly derived from historical data or predictive machine learning models, is available. We demonstrate that such a predictor can indeed reduce the number of samples required for all three property testing tasks. The reduction in sample complexity depends directly on the predictor's quality, measured by its total variation distance from p.
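As a minimal illustration of the quality measure mentioned above, the following sketch computes the total variation distance between a distribution p and a predicted distribution over the same finite domain, using the standard definition (half the L1 distance). The function name, the variable `q_hat`, and the example probabilities are assumptions made for illustration; they do not come from the paper, and this is not the paper's testing algorithm.

```python
from typing import Sequence


def total_variation_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Return half the L1 distance between two distributions on the same finite domain."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same domain")
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical example: an underlying distribution p (uniform over 4 elements)
# and a predicted distribution q_hat, e.g. one estimated from historical data.
p = [0.25, 0.25, 0.25, 0.25]
q_hat = [0.40, 0.20, 0.20, 0.20]

tv = total_variation_distance(p, q_hat)
print(f"TV distance between p and the predictor: {tv:.2f}")
```

A smaller distance means a more accurate predictor, which is the regime in which the abstract reports a reduction in the number of samples required.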
May-28-2025, 13:22:09 GMT
- Country:
- Europe > Austria
- Vienna (0.14)
- North America
- Canada > Quebec (0.14)
- United States
- California (0.14)
- Massachusetts > Middlesex County
- Cambridge (0.14)
- New Jersey (0.14)
- New York (0.14)
- Genre:
- Research Report > New Finding (0.93)
- Technology: