Understanding and Alleviating Memory Consumption in RLHF for LLMs
Jin Zhou, Hanmei Yang, Steven Tang, Mingcan Xiang, Hui Guan, Tongping Liu
arXiv.org Artificial Intelligence
Fine-tuning with Reinforcement Learning from Human Feedback (RLHF) is essential for aligning large language models (LLMs), but it often runs into severe memory pressure. This study is the first to examine memory usage in the RLHF setting: it evaluates several memory management strategies and identifies the causes of excessive memory consumption. In addition, we introduce a simple yet effective approach that substantially reduces the memory required for RLHF fine-tuning.
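For context, below is a minimal sketch of how per-step peak GPU memory can be measured with PyTorch's built-in counters; the model, batch shape, and helper names are illustrative assumptions, not the paper's tooling or method.

```python
# A minimal sketch (assumed names; not the paper's tooling) of measuring
# peak GPU memory per training step with PyTorch's memory counters.
import torch
import torch.nn as nn

def peak_memory_mib(step_fn):
    """Run step_fn() and return the peak GPU memory it allocated, in MiB."""
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    step_fn()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**20

if torch.cuda.is_available():
    device = torch.device("cuda")
    # Stand-in for one of the models a PPO-style RLHF pipeline keeps resident
    # (actor, critic, reward model, frozen reference model).
    actor = nn.Sequential(nn.Linear(4096, 4096), nn.GELU(),
                          nn.Linear(4096, 4096)).to(device)
    optim = torch.optim.AdamW(actor.parameters())
    batch = torch.randn(8, 4096, device=device)

    def train_step():
        optim.zero_grad(set_to_none=True)
        loss = actor(batch).pow(2).mean()
        loss.backward()   # gradients plus AdamW state dominate the footprint
        optim.step()

    print(f"peak memory per step: {peak_memory_mib(train_step):.1f} MiB")
```

Comparing such a counter across the models an RLHF pipeline keeps resident, and across candidate memory management strategies, is one simple way to see where the memory actually goes.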
Oct-21-2024