GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
Neural Information Processing Systems
Speculative decoding accelerates inference in large language models (LLMs) by generating multiple draft tokens simultaneously.
Jun-13-2026, 23:52:31 GMT