Draft-based Approximate Inference for LLMs
