LoRA-Pro: Are Low-Rank Adapters Properly Optimized?

Zhengbo Wang, Jian Liang

arXiv.org Artificial Intelligence

Low-rank adaptation, also known as LoRA, has emerged as a prominent method for parameter-efficient fine-tuning of foundation models. Despite its computational efficiency, LoRA still yields inferior performance compared to full fine-tuning. In this paper, we first uncover a fundamental connection between the optimization processes of LoRA and full fine-tuning: using LoRA for optimization is mathematically equivalent to full fine-tuning using a low-rank gradient for parameter updates. This low-rank gradient can be expressed in terms of the gradients of the two low-rank matrices in LoRA. Leveraging this insight, we introduce LoRA-Pro, a method that enhances LoRA's performance by strategically adjusting the gradients of these low-rank matrices. This adjustment allows the low-rank gradient to more accurately approximate the full fine-tuning gradient, thereby narrowing the performance gap between LoRA and full fine-tuning. Furthermore, we theoretically derive the optimal solutions for adjusting the gradients of the low-rank matrices and apply them during fine-tuning in LoRA-Pro. We conduct extensive experiments across natural language understanding, dialogue generation, mathematical reasoning, code generation, and image classification tasks, demonstrating that LoRA-Pro substantially improves LoRA's performance and effectively narrows the gap with full fine-tuning.

Foundation models (Radford et al., 2021; Brown et al., 2020; Achiam et al., 2023; Kirillov et al., 2023; Rombach et al., 2022; Touvron et al., 2023) have become the cornerstone of modern deep learning. Remarkably, some foundation models even demonstrate emergent properties (Hoffmann et al., 2022; Kaplan et al., 2020). Due to these advantages, foundation models have been widely applied to various downstream applications. Nevertheless, they still require additional fine-tuning when applied to downstream tasks, and the huge parameter size of foundation models results in high cost at this stage.
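The equivalence stated in the abstract can be checked numerically. With LoRA, the weight is parameterized as W = W0 + BA (W0 frozen); if g is the full fine-tuning gradient dL/dW, the chain rule gives dL/dB = g Aᵀ and dL/dA = Bᵀg, and a first-order expansion of a gradient step on (A, B) shows the product W moves along the low-rank direction (dL/dB)A + B(dL/dA) = g AᵀA + BBᵀg. A minimal NumPy sketch of this identity (shapes and variable names are illustrative, not the paper's notation):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2

# LoRA parameterization: W = W0 + B @ A, with W0 frozen
B = rng.normal(size=(m, r))
A = rng.normal(size=(r, n))
g = rng.normal(size=(m, n))  # stand-in for the full gradient dL/dW

# Chain rule through the product B @ A
g_B = g @ A.T   # dL/dB
g_A = B.T @ g   # dL/dA

# First-order effect of a gradient step on (A, B) on the product:
# (B - eta*g_B)(A - eta*g_A) - B@A = -eta*(g_B@A + B@g_A) + O(eta^2),
# i.e. LoRA optimization equals full fine-tuning with this low-rank gradient
g_lowrank = g_B @ A + B @ g_A

# Same quantity written directly in terms of g, A, B
assert np.allclose(g_lowrank, g @ A.T @ A + B @ B.T @ g)

# Each summand has rank at most r, so the equivalent gradient has rank <= 2r
print(np.linalg.matrix_rank(g_lowrank))  # at most 2r = 4
```

LoRA-Pro's contribution, per the abstract, is to adjust g_A and g_B so that this low-rank gradient better approximates the full gradient g; the sketch above only verifies the baseline equivalence that motivates the method.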