EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models

Pan, Xuchen, Chen, Yanxi, Li, Yaliang, Ding, Bolin, Zhou, Jingren

Feb-1-2024–arXiv.org Artificial Intelligence

This work introduces EE-Tuning, a lightweight and economical solution for training and tuning early-exit large language models (LLMs). In contrast to the common approach of full-parameter pre-training, EE-Tuning augments any pre-trained (and possibly fine-tuned) standard LLM with additional early-exit layers that are tuned in a parameter-efficient manner, which requires significantly fewer computational resources and less training data. Our implementation of EE-Tuning achieves outstanding training efficiency via extensive performance optimizations, as well as scalability due to its full compatibility with 3D parallelism. Results of systematic experiments validate the efficacy of EE-Tuning, confirming that effective early-exit LLM inference can be achieved with a limited training budget. In the hope of making early-exit LLMs accessible to the community, we release the source code of our implementation of EE-Tuning at https://github.com/pan-x-c/EE-LLM.
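The idea of tuning only the added early-exit layers on top of a frozen backbone can be illustrated with a toy sketch. This is not the released EE-LLM code: the tiny tanh "backbone", the single linear exit head, the plain SGD loop, and the names `backbone_layer` and `tune_exit_head` are all assumptions made for illustration, written in pure Python so the example stays self-contained.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def backbone_layer(weights, hidden):
    """One frozen toy backbone layer: linear map followed by tanh (hypothetical)."""
    return [math.tanh(z) for z in matvec(weights, hidden)]


def tune_exit_head(backbone, exit_index, data, num_classes,
                   lr=0.5, epochs=200, seed=0):
    """Train a linear exit head placed after `exit_index` backbone layers.

    The backbone weights are only read, never written, so the hidden states
    feeding the exit head can be computed once up front.
    """
    feats = []
    for x, y in data:
        h = x
        for layer in backbone[:exit_index]:
            h = backbone_layer(layer, h)
        feats.append((h, y))

    rng = random.Random(seed)
    dim = len(feats[0][0])
    head = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(num_classes)]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for h, y in feats:
            probs = softmax(matvec(head, h))
            total += -math.log(probs[y])
            # Cross-entropy gradient w.r.t. the head weights only.
            for c in range(num_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    head[c][j] -= lr * grad * h[j]
        losses.append(total / len(feats))
    return head, losses


# Toy usage: a two-layer frozen backbone with an exit head after layer 1.
backbone = [[[1.0, 0.5], [-0.5, 1.0]], [[0.8, -0.3], [0.3, 0.8]]]
data = [([1.0, 0.2], 0), ([0.8, -0.1], 0), ([-1.0, 0.3], 1), ([-0.7, 0.1], 1)]
head, losses = tune_exit_head(backbone, exit_index=1, data=data, num_classes=2)
```

In this sketch the only trainable parameters are the entries of `head`, which is the sense in which such tuning is parameter-efficient; a real early-exit LLM would attach heads of this kind at several depths of a transformer.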

early exit, early-exit layer, ee-tuning

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)