Speculative Decoding with Big Little Decoder
Neural Information Processing Systems
Figure 1: Illustration of (Left) the normal autoregressive decoding procedure of a large model and (Right) BiLD, which consists of a small model and a large model. In BiLD, the small model generates tokens autoregressively (i.e., sequentially) until it hands over control to the large model.
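The handover described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the caption does not say when control passes to the large model, so the sketch assumes a simple confidence threshold on the small model's top prediction, and it assumes the large model supplies only the one token the small model was unsure about. The names `bild_style_decode`, `toy_small`, `toy_large`, and `threshold` are hypothetical.

```python
from typing import Callable, Dict, List

# A "model" is any function mapping a token prefix to a probability
# distribution over the next token (hypothetical interface for illustration).
Model = Callable[[List[str]], Dict[str, float]]


def bild_style_decode(small: Model, large: Model, prompt: List[str],
                      max_new_tokens: int, threshold: float = 0.6,
                      eos: str = "<eos>") -> List[str]:
    """Greedy decoding in which the small model runs until it is unsure."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = small(tokens)
        token, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence < threshold:
            # Assumed handover rule: the small model's top probability is
            # below the threshold, so the large model chooses this token.
            large_probs = large(tokens)
            token = max(large_probs, key=large_probs.get)
        tokens.append(token)
        if token == eos:
            break
    return tokens


def toy_small(prefix: List[str]) -> Dict[str, float]:
    # Toy stand-in: confident right after "the", unsure everywhere else.
    if prefix[-1] == "the":
        return {"cat": 0.9, "dog": 0.1}
    return {"sat": 0.4, "ran": 0.35, "<eos>": 0.25}


def toy_large(prefix: List[str]) -> Dict[str, float]:
    # Toy stand-in for the large model.
    if prefix[-1] == "cat":
        return {"sat": 0.8, "ran": 0.2}
    return {"<eos>": 1.0}


result = bild_style_decode(toy_small, toy_large, ["the"], max_new_tokens=5)
print(result)
```

In this toy run the small model produces the first new token on its own and the large model is consulted for the remaining ones. A real system would use actual language models and would likely batch the large model's work rather than call it token by token.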
Oct-8-2025, 23:12:30 GMT
- Country:
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Genre:
- Research Report > New Finding (0.67)
- Technology: