Parallel Sampling via Autospeculation
Anari, Nima, Baronio, Carlo, Chen, CJ, Haqi, Alireza, Koehler, Frederic, Li, Anqi, Vuong, Thuy-Duong
arXiv.org Artificial Intelligence
We present parallel algorithms to accelerate sampling via counting in two settings: any-order autoregressive models and denoising diffusion models. An any-order autoregressive model accesses a target distribution $μ$ on $[q]^n$ through an oracle that provides conditional marginals, while a denoising diffusion model accesses a target distribution $μ$ on $\mathbb{R}^n$ through an oracle that provides conditional means under Gaussian noise. Standard sequential sampling algorithms require $\widetilde{O}(n)$ time to produce a sample from $μ$ in either setting. We show that, by issuing oracle calls in parallel, the expected sampling time can be reduced to $\widetilde{O}(n^{1/2})$. This improves the previous $\widetilde{O}(n^{2/3})$ bound for any-order autoregressive models and yields the first parallel speedup for diffusion models in the high-accuracy regime, under the relatively mild assumption that the support of $μ$ is bounded. We introduce a novel technique to obtain our results: speculative rejection sampling. This technique leverages an auxiliary ``speculative'' distribution~$ν$ that approximates~$μ$ to accelerate sampling. Our technique is inspired by the well-studied ``speculative decoding'' techniques popular in large language models, but differs in key ways. Firstly, we use ``autospeculation,'' namely we build the speculation $ν$ out of the same oracle that defines~$μ$. In contrast, speculative decoding typically requires a separate, faster, but potentially less accurate ``draft'' model $ν$. Secondly, the key differentiating factor in our technique is that we make and accept speculations at a ``sequence'' level rather than at the level of single (or a few) steps. This last fact is key to unlocking our parallel runtime of $\widetilde{O}(n^{1/2})$.
Nov-12-2025
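To make the sequential baseline from the abstract concrete, here is a minimal sketch of any-order autoregressive sampling on $[q]^n$ through a conditional-marginal oracle. It illustrates only the standard one-coordinate-at-a-time sampler that uses $n$ sequential oracle calls, not the paper's parallel algorithm. The names `make_oracle` and `sequential_sample`, the brute-force oracle, and the toy distribution are all hypothetical and chosen for illustration.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Partial = Tuple[Optional[int], ...]          # None marks an unpinned coordinate
Oracle = Callable[[Partial, int], List[float]]


def make_oracle(mu: Dict[Tuple[int, ...], float], q: int) -> Oracle:
    """Brute-force conditional-marginal oracle for an explicit toy mu on [q]^n.

    Assumption: mu is small enough to enumerate. A real oracle would be a
    trained model or a counting routine, not a lookup table.
    """
    def oracle(pinned: Partial, i: int) -> List[float]:
        weights = [0.0] * q
        for x, p in mu.items():
            # Keep only configurations consistent with the pinned coordinates.
            if all(v is None or v == x[j] for j, v in enumerate(pinned)):
                weights[x[i]] += p
        total = sum(weights)
        return [w / total for w in weights]
    return oracle


def sequential_sample(
    oracle: Oracle, n: int, q: int, order: Sequence[int], rng: random.Random
) -> Tuple[Tuple[int, ...], int]:
    """Pin coordinates one at a time in the given order.

    Returns the sample and the number of oracle calls, which is exactly n:
    each call depends on the outcome of the previous one.
    """
    pinned: List[Optional[int]] = [None] * n
    calls = 0
    for i in order:
        marginal = oracle(tuple(pinned), i)   # law of x_i given pinned values
        calls += 1
        pinned[i] = rng.choices(range(q), weights=marginal)[0]
    return tuple(pinned), calls


if __name__ == "__main__":
    # Toy target on {0,1}^3 with perfectly correlated coordinates.
    toy_mu = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
    toy_oracle = make_oracle(toy_mu, q=2)
    sample, num_calls = sequential_sample(
        toy_oracle, n=3, q=2, order=[2, 0, 1], rng=random.Random(0)
    )
    print(sample, num_calls)
```

Because each oracle call conditions on the values pinned so far, the loop has sequential depth $n$. That depth is what the abstract's parallel scheme reduces to $\widetilde{O}(n^{1/2})$ in expectation by issuing oracle calls in parallel.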