Fast Inference via Hierarchical Speculative Decoding
