barbarik2
On Scalable Testing of Samplers
In this paper we study the problem of testing of constrained samplers over high-dimensional distributions with $(\varepsilon,\eta,\delta)$ guarantees. Samplers are increasingly used in a wide range of safety-critical ML applications, and hence the testing problem has gained importance. For $n$-dimensional distributions, the existing state-of-the-art algorithm, $\mathsf{Barbarik2}$, has a worst case query complexity of exponential in $n$ and hence is not ideal for use in practice. Our primary contribution is an exponentially faster algorithm, $\mathsf{Barbarik3}$, that has a query complexity linear in $n$ and hence can easily scale to larger instances. We demonstrate our claim by implementing our algorithm and then comparing it against $\mathsf{Barbarik2}$. Our experiments on the samplers $\mathsf{wUnigen3}$ and $\mathsf{wSTS}$, find that $\mathsf{Barbarik3}$ requires $10\times$ fewer samples for $\mathsf{wUnigen3}$ and $450\times$ fewer samples for $\mathsf{wSTS}$ as compared to $\mathsf{Barbarik2}$.
On Testing of Samplers
Given a set of items F and a weight function W: F -> (0,1), the problem of sampling seeks to sample an item proportional to its weight. Sampling is a fundamental problem in machine learning. The daunting computational complexity of sampling with formal guarantees leads designers to propose heuristics-based techniques for which no rigorous theoretical analysis exists to quantify the quality of the generated distributions. This poses a challenge in designing a testing methodology to test whether a sampler under test generates samples according to a given distribution. Only recently, Chakraborty and Meel (2019) designed the first scalable verifier, called Barbarik, for samplers in the special case when the weight function W is constant, that is, when the sampler is supposed to sample uniformly from F. The techniques in Barbarik, however, fail to handle general weight functions. The primary contribution of this paper is an affirmative answer to the above challenge: motivated by Barbarik, but using different techniques and analysis, we design Barbarik2, an algorithm to test whether the distribution generated by a sampler is epsilon-close or eta-far from any target distribution. In contrast to black-box sampling techniques that require a number of samples proportional to |F|, Barbarik2 requires only \tilde{O}(Tilt(W, F)^2/eta(eta - 6*epsilon)^3) samples, where the Tilt is the maximum ratio of weights of two points in F. Barbarik2 can handle any arbitrary weight function. We present a prototype implementation of Barbarik2 and use it to test three state-of-the-art samplers.
- Asia > Singapore (0.05)
- North America > United States > California > Los Angeles County > Santa Monica (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Singapore (0.05)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Asia > India > West Bengal > Kolkata (0.04)
- Asia > Singapore (0.05)
- North America > United States > California > Los Angeles County > Santa Monica (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Singapore (0.05)
- Asia > Middle East > Jordan (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > India > West Bengal > Kolkata (0.04)
On Scalable Testing of Samplers
In this paper we study the problem of testing of constrained samplers over high-dimensional distributions with (\varepsilon,\eta,\delta) guarantees. Samplers are increasingly used in a wide range of safety-critical ML applications, and hence the testing problem has gained importance. For n -dimensional distributions, the existing state-of-the-art algorithm, \mathsf{Barbarik2}, has a worst case query complexity of exponential in n and hence is not ideal for use in practice. Our primary contribution is an exponentially faster algorithm, \mathsf{Barbarik3}, that has a query complexity linear in n and hence can easily scale to larger instances. We demonstrate our claim by implementing our algorithm and then comparing it against \mathsf{Barbarik2} .
On Testing of Samplers
Given a set of items F and a weight function W: F - (0,1), the problem of sampling seeks to sample an item proportional to its weight. Sampling is a fundamental problem in machine learning. The daunting computational complexity of sampling with formal guarantees leads designers to propose heuristics-based techniques for which no rigorous theoretical analysis exists to quantify the quality of the generated distributions. This poses a challenge in designing a testing methodology to test whether a sampler under test generates samples according to a given distribution. Only recently, Chakraborty and Meel (2019) designed the first scalable verifier, called Barbarik, for samplers in the special case when the weight function W is constant, that is, when the sampler is supposed to sample uniformly from F. The techniques in Barbarik, however, fail to handle general weight functions.