EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test

Jun-22-2026, 17:43:15 GMT–Neural Information Processing Systems

The sequential nature of modern LLMs makes them expensive and slow, and speculative sampling has proven to be an effective solution to this problem. Methods like EAGLE perform autoregression at the feature level, reusing top-layer features from the target model to achieve better results than vanilla speculative sampling. A growing trend in the LLM community is scaling up training data to improve model intelligence without increasing inference costs. However, we observe that scaling up data provides limited improvements for EAGLE. We identify that this limitation arises from EAGLE's feature prediction constraints.
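For readers unfamiliar with the baseline the abstract refers to, the accept/reject rule of vanilla speculative sampling can be sketched as follows. This is a minimal illustration of the generic technique, not of EAGLE or EAGLE-3: the abstract gives no implementation details, so the function names (`speculative_step`, `empirical_distribution`) and the toy probability vectors are assumptions made for this example only.

```python
import random


def speculative_step(p, q, rng):
    """One accept/reject step of vanilla speculative sampling (toy sketch).

    p: target-model probabilities over a small vocabulary (hypothetical values).
    q: draft-model probabilities over the same vocabulary (hypothetical values).
    Returns (token, accepted_flag).
    """
    # The cheap draft model proposes a token.
    x = rng.choices(range(len(q)), weights=q, k=1)[0]
    # The target model accepts it with probability min(1, p[x] / q[x]).
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    # On rejection, resample from the residual max(0, p - q);
    # rng.choices normalizes the weights internally.
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    y = rng.choices(range(len(p)), weights=residual, k=1)[0]
    return y, False


def empirical_distribution(p, q, n, seed):
    """Run n independent steps and return the observed token frequencies."""
    rng = random.Random(seed)
    counts = [0] * len(p)
    for _ in range(n):
        token, _ = speculative_step(p, q, rng)
        counts[token] += 1
    return [c / n for c in counts]


if __name__ == "__main__":
    target = [0.6, 0.3, 0.1]  # assumed target distribution
    draft = [0.2, 0.5, 0.3]   # assumed draft distribution
    print(empirical_distribution(target, draft, n=20000, seed=0))
```

The observed frequencies should approach the target distribution `p` even though every proposal comes from the draft distribution `q`, which is why speculative sampling speeds up decoding without changing the output distribution. How closely `q` tracks `p` determines how many drafted tokens are accepted, and that is the quantity draft-model methods such as EAGLE aim to improve.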

large language model, machine learning, target model

Country:
- North America > United States (0.14)

Genre:
- Research Report > Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
