


Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox

Neural Information Processing Systems

We further show the mismatched sampling paradox: a learner who knows the reward distributions and samples from the correct posterior distribution can perform exponentially worse than a learner who does not know the reward distributions and simply samples from a well-chosen Gaussian posterior.





Batches

Neural Information Processing Systems

In this paper, we find an appealing way to synthesize [JO19] and [CLM19] to give the best of both worlds: an algorithm which runs in polynomial time and can exploit structure in the underlying distribution to achieve sublinear sample complexity.




Tight Bounds for Answering Adaptively Chosen Concentrated Queries

Rapoport, Emma, Cohen, Edith, Stemmer, Uri

arXiv.org Artificial Intelligence

Most work on adaptive data analysis assumes that samples in the dataset are independent. When correlations are allowed, even the non-adaptive setting can become intractable, unless some structural constraints are imposed. To address this, Bassily and Freund [2016] introduced the elegant framework of concentrated queries, which requires the analyst to restrict itself to queries that are concentrated around their expected value. While this assumption makes the problem trivial in the non-adaptive setting, in the adaptive setting it remains quite challenging. In fact, all known algorithms in this framework support significantly fewer queries than in the independent case: At most $O(n)$ queries for a sample of size $n$, compared to $O(n^2)$ in the independent setting. In this work, we prove that this utility gap is inherent under the current formulation of the concentrated queries framework, assuming some natural conditions on the algorithm. Additionally, we present a simplified version of the best-known algorithms that match our impossibility result.

