Parallel Speculative Decoding with Adaptive Draft Length
