EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
