Efficient Storage of Fine-Tuned Models via Low-Rank Approximation of Weight Residuals
Ryu, Simo, Seo, Seunghyun, Yoo, Jaejun
arXiv.org Artificial Intelligence
In this paper, we present an efficient method for storing fine-tuned models by leveraging the low-rank properties of weight residuals, that is, the differences between fine-tuned and pre-trained weights. Our key observation is that weight residuals in large overparameterized models exhibit even stronger low-rank characteristics. Based on this insight, we propose Efficient Residual Encoding (ERE), a novel approach that stores fine-tuned model weights efficiently by keeping only a low-rank approximation of the weight residuals. Furthermore, we analyze the robustness of weight residuals and push the limit of storage efficiency by additionally applying quantization and layer-wise rank allocation. Our experimental results demonstrate that our method significantly reduces memory footprint while preserving performance across various tasks and modalities.
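The idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' ERE implementation: it assumes a truncated SVD as the low-rank approximation and a simple symmetric 8-bit quantizer as a stand-in for the "additional quantization" step, and it omits layer-wise rank allocation. All function names (encode_residual, decode_weights, quantize_int8, dequantize_int8) are hypothetical and chosen only for illustration.

```python
import numpy as np


def encode_residual(w_pre, w_ft, rank):
    """Return factors (a, b) such that a @ b approximates w_ft - w_pre.

    Assumption: a truncated SVD is used as the low-rank approximation.
    """
    residual = w_ft - w_pre
    u, s, vt = np.linalg.svd(residual, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold the top singular values into the left factor
    b = vt[:rank, :]
    return a, b


def decode_weights(w_pre, a, b):
    """Rebuild approximate fine-tuned weights from the shared pre-trained weights."""
    return w_pre + a @ b


def quantize_int8(x):
    """Symmetric uniform 8-bit quantization (illustrative stand-in only)."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.round(x / scale).astype(np.int8)
    return q, scale


def dequantize_int8(q, scale):
    """Map 8-bit codes back to floating point."""
    return q.astype(np.float64) * scale


# Toy example: a 64 x 48 layer whose fine-tuning update has rank 4.
rng = np.random.default_rng(0)
w_pre = rng.standard_normal((64, 48))
delta = 0.01 * rng.standard_normal((64, 4)) @ rng.standard_normal((4, 48))
w_ft = w_pre + delta

a, b = encode_residual(w_pre, w_ft, rank=4)
w_rec = decode_weights(w_pre, a, b)

# Values stored per fine-tuned layer: the two factors instead of the full matrix.
stored_values = a.size + b.size
full_values = w_ft.size
```

In this toy setting only the two factors need to be kept for each fine-tuned layer (448 values instead of 3072), while the pre-trained weights are stored once and shared. Quantizing the factors with quantize_int8 would shrink them further at the cost of a rounding error of at most half a quantization step per entry.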
May-28-2023
- Genre:
- Research Report > New Finding (0.66)