
Efficient Testable Learning of Halfspaces with Adversarial Label Noise

Neural Information Processing Systems

We give the first polynomial-time algorithm for the testable learning of halfspaces in the presence of adversarial label noise under the Gaussian distribution. In the recently introduced testable learning model, one is required to produce a tester-learner such that if the data passes the tester, then one can trust the output of the robust learner on the data. Our tester-learner runs in time $\text{poly}(d/\epsilon)$ and outputs a halfspace with misclassification error $O(\text{opt})+\epsilon$, where $\text{opt}$ is the 0-1 error of the best fitting halfspace. At a technical level, our algorithm employs an iterative soft localization technique enhanced with appropriate testers to ensure that the data distribution is sufficiently similar to a Gaussian. Finally, our algorithm can be readily adapted to yield an efficient and testable active learner requiring only $d \cdot \text{polylog}(1/\epsilon)$ labeled examples.
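
The distribution testers the abstract alludes to can be caricatured in a few lines. The sketch below (not the paper's algorithm; the function name and tolerance are illustrative) accepts only if the empirical mean and variance of 1-D projected data match those of a standard Gaussian:

```python
def gaussian_moment_check(xs, tol=0.1):
    """Toy distribution tester: accept only if the empirical mean and
    variance of the (projected, 1-D) data match N(0, 1) within tol.
    The actual tester-learner verifies far more than two moments."""
    m = len(xs)
    mean = sum(xs) / m
    var = sum((x - mean) ** 2 for x in xs) / m
    return abs(mean) <= tol and abs(var - 1.0) <= tol

# A symmetric +/-1 sample matches the first two Gaussian moments exactly,
# while a constant sample has variance far from 1 and is rejected.
print(gaussian_moment_check([-1.0, 1.0] * 30))
print(gaussian_moment_check([0.5] * 60))
```

The point of such checks is the abstract's guarantee structure: if the data passes, the downstream learner's output can be trusted; if the data is far from Gaussian, the tester is allowed to reject.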


Testing for Families of Distributions via the Fourier Transform

Neural Information Processing Systems

We study the general problem of testing whether an unknown discrete distribution belongs to a specified family of distributions. More specifically, given a distribution family P and sample access to an unknown discrete distribution D, we want to distinguish (with high probability) between the case that D ∈ P and the case that D is ε-far, in total variation distance, from every distribution in P. This is the prototypical hypothesis testing problem that has received significant attention in statistics and, more recently, in computer science. The main contribution of this work is a simple and general testing technique that is applicable to all distribution families whose Fourier spectrum satisfies a certain approximate sparsity property. We apply our Fourier-based framework to obtain near sample-optimal and computationally efficient testers for the following fundamental distribution families: Sums of Independent Integer Random Variables (SIIRVs), Poisson Multinomial Distributions (PMDs), and Discrete Log-Concave Distributions. For the first two, ours are the first non-trivial testers in the literature, vastly generalizing previous work on testing Poisson Binomial Distributions. For the third, our tester improves on prior work in both sample and time complexity.
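
The approximate-sparsity property can be illustrated directly (a hedged sketch, not the paper's tester; the band choice and threshold below are illustrative): compute the discrete Fourier transform of a distribution and check how much of its energy sits on a small set of low frequencies. A binomial distribution, like the SIIRVs and PMDs in the abstract, concentrates its Fourier mass near frequency zero:

```python
import cmath
from math import comb

def dft(p):
    """Discrete Fourier transform of a probability vector p over Z_n."""
    n = len(p)
    return [sum(p[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
            for k in range(n)]

def fourier_energy_in_band(p, band):
    """Fraction of the L2 Fourier mass carried by coefficient indices in band."""
    coeffs = dft(p)
    total = sum(abs(c) ** 2 for c in coeffs)
    return sum(abs(coeffs[k]) ** 2 for k in band) / total

# Binomial(n-1, 1/2) puts almost all Fourier energy on frequencies near 0 (mod n).
n = 32
binom = [comb(n - 1, x) / 2 ** (n - 1) for x in range(n)]
low_band = list(range(4)) + list(range(n - 3, n))
print(fourier_energy_in_band(binom, low_band) > 0.99)
```

In the actual framework, a tester exploits this sparsity by comparing the empirical Fourier spectrum of the samples to the sparse spectra attainable by members of P.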



Private Identity Testing for High-Dimensional Distributions

Neural Information Processing Systems

In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in R^d with known covariance and product distributions over {±1}^d. Our testers have improved sample complexity compared to those derived from previous techniques, and are the first testers whose sample complexity matches the order-optimal minimax sample complexity of O(d^{1/2}/α²) in many parameter regimes. We construct two types of testers, exhibiting tradeoffs between sample complexity and computational complexity. Finally, we provide a two-way reduction between testing a subclass of multivariate product distributions and testing univariate distributions, and thereby obtain upper and lower bounds for testing this subclass of product distributions.
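
One standard way to make an identity tester differentially private (a generic sketch under stated assumptions, not the paper's construction) is to privatize the sample histogram with Laplace noise and compute a chi-squared-style statistic on the noisy counts; all function names and parameter choices here are illustrative:

```python
import math
import random

def laplace(scale):
    """Sample Laplace(0, scale) via the inverse CDF; used for pure eps-DP noise."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_histogram(samples, domain_size, eps):
    """eps-DP histogram: swapping one sample moves two counts by 1 each,
    so the L1 sensitivity is 2 and Laplace(2/eps) noise per bin suffices."""
    counts = [0.0] * domain_size
    for x in samples:
        counts[x] += 1.0
    return [c + laplace(2.0 / eps) for c in counts]

def identity_statistic(noisy_counts, q, m):
    """Chi-squared-style distance of the (noisy) counts from the reference
    distribution q; post-processing a DP histogram preserves eps-DP."""
    return sum((c - m * qi) ** 2 / (m * qi) for c, qi in zip(noisy_counts, q))

random.seed(0)
m, k, eps = 20000, 10, 1.0
q = [1.0 / k] * k
near = [i % k for i in range(m)]                        # exactly uniform sample
far = [0] * (m // 2) + [i % k for i in range(m // 2)]   # half the mass on 0
stat_near = identity_statistic(private_histogram(near, k, eps), q, m)
stat_far = identity_statistic(private_histogram(far, k, eps), q, m)
print(stat_near < stat_far)
```

The abstract's contribution is precisely that naive recipes like this one are suboptimal in high dimensions; its testers match the O(d^{1/2}/α²) minimax rate that generic noise addition does not achieve.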


New Hearing Aid Company, Fortell, Brings in Steve Martin and Others as Fans

WIRED

Well, Who Do You Know? AI-powered startup Fortell has become a secret handshake for the privileged hearing-impaired crowd who swear by the product. Now, it wants to be in your ears. A secret is percolating at dinner parties, salons, and cocktail gatherings among the august New York City elite. It's whispered in the circles of financial masters of the universe, Hollywood stars, and owners of sports teams. Many haven't heard it--or if they did hear, they might not have made out the words through noisy cross-conversations. Once they do know--particularly if they're boomers--they want it desperately. Fortell is a hearing aid, one that claims to use AI to provide a dramatically superior aural experience. The chosen few included in its beta test claim that it seems to top the performance of high-end devices they'd been unhappily using. These testers have made pilgrimages to Fortell's headquarters on the fifth floor of a WeWork facility in New York City's trendy SoHo neighborhood, where they were fitted for the hearing aids--which from the outside look pretty much like standard, over-the-ear, teardrop-shaped devices. But the big moment comes when a Fortell staffer takes them down to street level.


Sharp Bounds for Generalized Uniformity Testing

Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

Neural Information Processing Systems

We study the problem of generalized uniformity testing of a discrete probability distribution: given samples from a probability distribution p over a discrete domain Ω of unknown size, we want to distinguish, with probability at least 2/3, between the case that p is uniform on some subset of Ω and the case that p is ε-far, in total variation distance, from every such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, within constant factors, and a matching worst-case information-theoretic lower bound.
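
A toy version of the idea behind such testers (a plug-in sketch, not the paper's estimator, which uses unbiased collision-based statistics with carefully chosen sample sizes): by Cauchy-Schwarz, ||p||₃³ ≥ (||p||₂²)², with equality exactly when p is uniform on its support, so a large gap between these power sums is evidence against subset-uniformity:

```python
def power_sum(samples, t):
    """Plug-in estimate of ||p||_t^t from empirical frequencies."""
    m = len(samples)
    counts = {}
    for x in samples:
        counts[x] = counts.get(x, 0) + 1
    return sum((c / m) ** t for c in counts.values())

def uniformity_gap(samples):
    """||p||_3^3 - (||p||_2^2)^2: zero when p is uniform on its support,
    and bounded away from zero when p is far from every subset-uniform
    distribution (here computed from plug-in frequency estimates)."""
    return power_sum(samples, 3) - power_sum(samples, 2) ** 2

subset_uniform = list(range(10)) * 6      # uniform on a 10-element subset
skewed = [0] * 25 + [1, 2, 3, 4, 5] * 7   # far from every subset-uniform p
print(abs(uniformity_gap(subset_uniform)) < 1e-9)
print(uniformity_gap(skewed) > 0.01)
```

The paper's contribution is pinning down exactly how many samples are needed to estimate these norms well enough, with a matching information-theoretic lower bound.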



A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice

Neural Information Processing Systems

The computational complexity of a property tester is the number of adjacency-list entries it reads, denoted its queries. Many works in graph property testing focus on testing plain graphs that contain only pure combinatorial information.
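
The query-complexity accounting above can be made concrete with a small oracle wrapper (an illustrative sketch; the class and method names are not from the paper) that charges one query per adjacency-list entry read:

```python
class QueryCountingOracle:
    """Adjacency-list oracle that charges one query per entry read,
    matching the definition of a property tester's query complexity."""

    def __init__(self, adj):
        self.adj = adj        # adj[v]: list of neighbors of vertex v
        self.queries = 0

    def degree(self, v):
        # Degree lookups are treated as free in this simple accounting.
        return len(self.adj[v])

    def neighbor(self, v, i):
        self.queries += 1     # one adjacency-list entry read = one query
        return self.adj[v][i]

# Reading two entries of vertex 0's adjacency list costs exactly two queries.
g = QueryCountingOracle({0: [1, 2], 1: [0, 2], 2: [0, 1]})
first, second = g.neighbor(0, 0), g.neighbor(0, 1)
print(g.queries)
```

A tester implemented against such an oracle can then report its query count directly, which is the complexity measure the text defines.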


Efficient Discrepancy Testing for Learning with Distribution Shift

Gautam Chandrasekaran, Adam R. Klivans, Vasilis Kontonis, Konstantinos Stavropoulos (UT Austin)

Neural Information Processing Systems

Our approach generalizes and improves upon all prior work on TDS learning (testable learning with distribution shift): (1) we obtain universal learners that succeed simultaneously for large classes of test distributions, (2) we achieve near-optimal error rates, and (3) we give exponential improvements for constant-depth circuits.