LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning
Longteng Zhang, Lin Zhang, Shaohuai Shi, Xiaowen Chu, Bo Li
arXiv.org Artificial Intelligence
The low-rank adaptation (LoRA) method greatly reduces the number of trainable parameters for fine-tuning large language models (LLMs); however, it still requires expensive activation memory to update the low-rank weights. Reducing the number of LoRA layers or using activation recomputation can harm fine-tuning performance or increase computational overhead. In this work, we present LoRA-FA, a memory-efficient fine-tuning method that reduces activation memory without performance degradation or expensive recomputation. LoRA-FA freezes the projection-down weight A and updates only the projection-up weight B in each LoRA layer. This ensures that the change in model weights resides in a low-rank space during LLM fine-tuning, while eliminating the need to store the full-rank input activations. We conduct extensive experiments across multiple model types (RoBERTa, T5, LLaMA) and model scales. Our results show that LoRA-FA consistently achieves fine-tuning accuracy close to that of full-parameter fine-tuning and LoRA across different tasks. Furthermore, LoRA-FA reduces the overall memory cost by up to 1.4x compared to LoRA.

Fine-tuning LLMs with full parameters is prohibitively expensive: for example, fine-tuning a LLaMA-65B (Touvron et al., 2023a) model with AdamW (Loshchilov & Hutter, 2017) requires more than 1 TB of GPU memory to store the model parameters, gradients, and optimizer states (Rajbhandari et al., 2020). To reduce the memory of full-parameter fine-tuning, parameter-efficient fine-tuning (PEFT) methods have been proposed that update only a small fraction of parameters, such as adapter weights (Houlsby et al., 2019; Hu et al., 2022) and prompt weights (Li & Liang, 2021; Lester et al., 2021).
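The core idea can be illustrated with a minimal NumPy sketch (hypothetical names and shapes, not the authors' implementation). With B trainable and A frozen, the backward pass for B only needs the cached rank-r activation A·x, whereas plain LoRA must also cache the full d-dimensional input x to compute A's gradient:

```python
import numpy as np

# Minimal sketch of one LoRA-FA layer. W0: frozen pretrained weight,
# A: frozen projection-down weight, B: trainable projection-up weight,
# initialized to zero so the adapted layer starts identical to W0.
rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4

W0 = rng.standard_normal((d_out, d_in)) * 0.02
A = rng.standard_normal((r, d_in)) / np.sqrt(d_in)  # frozen
B = np.zeros((d_out, r))                            # trainable

def forward(x):
    # Cache only the r-dim low-rank activation h = A @ x for the backward
    # pass; plain LoRA would also need the d_in-dim input x for A's gradient.
    h = A @ x
    return W0 @ x + B @ h, h

def grad_B(g_out, h):
    # dL/dB = g_out h^T, computed from the cached rank-r activation alone.
    return np.outer(g_out, h)

x = rng.standard_normal(d_in)
y, h = forward(x)
g = np.ones(d_out)        # placeholder upstream gradient
B -= 0.1 * grad_B(g, h)   # one SGD step on B; A and W0 stay frozen
```

For a hidden size d and rank r << d, caching h instead of x shrinks the per-layer LoRA activation footprint from O(d) to O(r) per token, which is where the reported memory savings come from.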
Aug-7-2023