Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff
