Confidence-Modulated Speculative Decoding for Large Language Models
