EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models
