Bayesian Low-rank Adaptation for Large Language Models
Yang, Adam X., Robeyns, Maxime, Wang, Xi, Aitchison, Laurence
arXiv.org Artificial Intelligence
Low-rank adaptation (LoRA) has emerged as a new paradigm for cost-efficient fine-tuning of large language models (LLMs). However, fine-tuned LLMs often become overconfident, especially when fine-tuned on small datasets. Bayesian methods, with their inherent ability to estimate uncertainty, serve as potent tools to mitigate overconfidence and enhance calibration. In this work, we introduce Laplace-LoRA, which applies a Bayesian approach to the LoRA parameters. Specifically, Laplace-LoRA applies a Laplace approximation to the posterior over the LoRA parameters, considerably improving the calibration of fine-tuned LLMs.

In recent years, fine-tuning large language models (LLMs) has become increasingly important (Houlsby et al., 2019; Hu et al., 2021; Liu et al., 2022; Ding et al., 2022; 2023). Fine-tuning is used both to adapt LLMs for specific tasks and to create general instruction-following models (e.g. using Reinforcement Learning from Human Feedback, RLHF; Wei et al., 2021; Ouyang et al., 2022; Chung et al., 2022; Wang et al., 2022). However, fine-tuned LLMs have a notable limitation: they often exhibit overconfidence (Jiang et al., 2021; Xiao et al., 2022; He et al., 2023; Tian et al., 2023; OpenAI, 2023). This is particularly problematic in safety-critical applications or when making decisions in areas where limited data is available, such as medical diagnosis, finance and experimental design (Singhal et al., 2022; Wu et al., 2023; Lampinen et al., 2023; Li et al., 2022). Consequently, there is an urgent need for strategies that enhance the calibration of fine-tuned LLMs, ensuring that their predictions are as trustworthy as they are powerful.

Bayesian deep learning is commonly proposed as a solution to overconfidence in deep networks. Historically, the field of Bayesian deep learning has frequently considered ResNets for image classification (Shridhar et al., 2019; Dusenberry et al., 2020; Izmailov et al., 2021).
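To illustrate the idea behind a Laplace approximation to a parameter posterior, here is a minimal sketch on a toy logistic regression, not on LoRA parameters and not the authors' implementation. The dataset, the prior precision, the function names, and the use of the probit approximation for the predictive are all assumptions made for illustration. The sketch finds the MAP weights, takes the inverse Hessian of the negative log posterior at that point as the Gaussian covariance, and shows that the resulting predictive probabilities are pulled toward 0.5 compared with the MAP point estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_posterior_hessian(X, w, prior_prec):
    """Hessian of the negative log posterior for logistic regression
    with an isotropic Gaussian prior of precision `prior_prec`."""
    p = sigmoid(X @ w)
    return X.T @ (X * (p * (1.0 - p))[:, None]) + prior_prec * np.eye(X.shape[1])

def fit_map(X, y, prior_prec=1.0, steps=25):
    """Newton's method for the MAP weights (toy-scale only)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) + prior_prec * w
        w = w - np.linalg.solve(neg_log_posterior_hessian(X, w, prior_prec), grad)
    return w

def laplace_predict(X_new, w_map, cov):
    """Approximate predictive under N(w_map, cov) using the probit
    approximation: the logit is shrunk by 1 / sqrt(1 + pi * var / 8)."""
    mean = X_new @ w_map
    var = np.einsum("ij,jk,ik->i", X_new, cov, X_new)  # x^T cov x per row
    return sigmoid(mean / np.sqrt(1.0 + np.pi * var / 8.0))

# Hypothetical small dataset: a bias column plus one feature.
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])

prior_prec = 1.0  # assumed prior precision
w_map = fit_map(X, y, prior_prec)

# Laplace approximation: Gaussian centred at the MAP estimate whose
# covariance is the inverse Hessian of the negative log posterior there.
cov = np.linalg.inv(neg_log_posterior_hessian(X, w_map, prior_prec))

X_test = np.column_stack([np.ones(3), np.array([-4.0, 0.0, 4.0])])
p_map = sigmoid(X_test @ w_map)               # point-estimate predictions
p_lap = laplace_predict(X_test, w_map, cov)   # uncertainty-aware predictions

print("MAP predictions:    ", p_map)
print("Laplace predictions:", p_lap)
```

The shrinkage is strongest for inputs far from the training data, where the variance of the logit is largest. That is the mechanism by which a posterior approximation can temper overconfident predictions.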
Feb-5-2024