EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization
