The Impact of Initialization on LoRA Finetuning Dynamics
Neural Information Processing Systems
In this paper, we study the role of initialization in Low Rank Adaptation (LoRA), as originally introduced in Hu et al. (2021). To start finetuning from the pretrained model, one can either initialize B to zero and A to random, or vice-versa. In both cases, the product BA is zero at initialization, so finetuning starts from the pretrained model. These two initialization schemes are seemingly similar: they should in principle yield the same performance and share the same optimal learning rate.
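As a rough illustration (not the paper's implementation), the two schemes can be sketched as follows in PyTorch; the class name LoRALinear, the rank default, and the init_scheme labels are illustrative choices, not names from the paper or any library.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen pretrained weight W plus a trainable low-rank update BA."""

    def __init__(self, in_features, out_features, rank=8, init_scheme="init_A"):
        super().__init__()
        # Pretrained weight, kept frozen during finetuning.
        self.weight = nn.Parameter(torch.randn(out_features, in_features),
                                   requires_grad=False)
        self.A = nn.Parameter(torch.empty(rank, in_features))
        self.B = nn.Parameter(torch.empty(out_features, rank))
        if init_scheme == "init_A":
            # Scheme 1: A random, B zero.
            nn.init.kaiming_uniform_(self.A)
            nn.init.zeros_(self.B)
        elif init_scheme == "init_B":
            # Scheme 2: B random, A zero.
            nn.init.kaiming_uniform_(self.B)
            nn.init.zeros_(self.A)
        else:
            raise ValueError(f"unknown init_scheme: {init_scheme}")
        # In both cases B @ A == 0, so the layer starts exactly at the
        # pretrained model.

    def forward(self, x):
        return x @ (self.weight + self.B @ self.A).T
```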