Multi-Token Joint Speculative Decoding for Accelerating Large Language Model Inference
