Tutorial Proposal: Speculative Decoding for Efficient LLM Inference