TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding
