The Impact of Initialization on LoRA Finetuning Dynamics
Neural Information Processing Systems
In this paper, we study the role of initialization in Low Rank Adaptation (LoRA), as originally introduced in Hu et al. (2021). To start finetuning from the pretrained model, one can either initialize B to zero and A to random, or vice-versa. In both cases, the product BA is zero at initialization, so finetuning starts from the pretrained model. These two initialization schemes are seemingly similar: they should in principle yield the same performance and share the same optimal learning rate.
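As a rough illustration (not the paper's implementation), the two schemes can be sketched as follows in PyTorch; the class name LoRALinear, the rank default, and the init_scheme labels are illustrative choices, not names from the paper or any library.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen pretrained weight W plus a trainable low-rank update BA."""

    def __init__(self, in_features, out_features, rank=8, init_scheme="init_A"):
        super().__init__()
        # Pretrained weight, kept frozen during finetuning.
        self.weight = nn.Parameter(torch.randn(out_features, in_features),
                                   requires_grad=False)
        self.A = nn.Parameter(torch.empty(rank, in_features))
        self.B = nn.Parameter(torch.empty(out_features, rank))
        if init_scheme == "init_A":
            # Scheme 1: A random, B zero.
            nn.init.kaiming_uniform_(self.A)
            nn.init.zeros_(self.B)
        elif init_scheme == "init_B":
            # Scheme 2: B random, A zero.
            nn.init.kaiming_uniform_(self.B)
            nn.init.zeros_(self.A)
        else:
            raise ValueError(f"unknown init_scheme: {init_scheme}")
        # In both cases B @ A == 0, so the layer starts exactly at the
        # pretrained model.

    def forward(self, x):
        return x @ (self.weight + self.B @ self.A).T
```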