EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees
