SPIRe: Boosting LLM Inference Throughput with Speculative Decoding