 autolora


AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning

arXiv.org Artificial Intelligence

Large-scale pretraining followed by task-specific finetuning has achieved great success in various NLP tasks. Since finetuning all parameters of large pretrained models poses substantial computational and memory challenges, several efficient finetuning methods have been developed. Among them, low-rank adaptation (LoRA), which finetunes low-rank incremental update matrices on top of frozen pretrained weights, has proven particularly effective. Nonetheless, LoRA's uniform rank assignment across all layers, along with its reliance on an exhaustive search to find the best rank, leads to high computation costs and suboptimal finetuning performance. To address these limitations, we introduce AutoLoRA, a meta-learning-based framework for automatically identifying the optimal rank of each LoRA layer. AutoLoRA associates each rank-1 matrix in a low-rank update matrix with a selection variable, which determines whether the rank-1 matrix should be discarded. A meta-learning-based method is developed to learn these selection variables. The optimal rank of each layer is then determined by thresholding the values of these variables. Our comprehensive experiments on natural language understanding, generation, and sequence labeling demonstrate the effectiveness of AutoLoRA.
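The selection-and-thresholding step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `weighted_update` and `select_components`, the example values of `alpha`, and the threshold `1 / k` are all assumptions made for illustration, and the meta-learning procedure that would actually learn `alpha` is omitted.

```python
import numpy as np


def weighted_update(B, A, alpha):
    """Return sum_j alpha[j] * outer(B[:, j], A[j, :]).

    B has shape (d, k) and A has shape (k, n), so each pair
    (B[:, j], A[j, :]) is one rank-1 component of the update.
    """
    # Broadcasting scales column j of B by alpha[j] before the product.
    return (B * alpha) @ A


def select_components(alpha, threshold):
    """Indices of rank-1 components whose selection variable passes the threshold."""
    return np.flatnonzero(alpha >= threshold)


rng = np.random.default_rng(0)
d, n, k = 6, 5, 4                      # hypothetical layer sizes and initial rank
B = rng.normal(size=(d, k))
A = rng.normal(size=(k, n))

# Hypothetical learned selection variables (assumed here to sum to 1).
alpha = np.array([0.45, 0.05, 0.40, 0.10])

# Assumed threshold for this sketch: 1 / k.
keep = select_components(alpha, threshold=1.0 / k)

# Discard the components that fall below the threshold.
B_pruned, A_pruned = B[:, keep], A[keep, :]
selected_rank = len(keep)

delta_w = weighted_update(B, A, alpha)
```

In this toy setting two of the four rank-1 components survive, so the layer would be finetuned with rank 2 instead of the initial rank 4. A different threshold or a different normalization of `alpha` would change which components are kept.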


AutoLoRa: A Parameter-Free Automated Robust Fine-Tuning Framework

arXiv.org Artificial Intelligence

With the emergence of foundation models (Bommasani et al., 2021), fine-tuning the pre-trained feature extractor (FE) has become a low-cost strategy to obtain superior performance in downstream tasks. Notably, GPT-3 (Brown et al., 2020) can achieve state-of-the-art (SOTA) performance on GLUE benchmarks (Wang et al., 2018) via parameter-efficient fine-tuning (Hu et al., 2021). Due to the ubiquitous existence of adversarial attacks (Goodfellow et al., 2014; Madry et al., 2018), adopting pre-trained FEs in safety-critical downstream areas such as medicine (Buch et al., 2018) and autonomous cars (Kurakin et al., 2018) necessitates robust fine-tuning (Hendrycks et al., 2019), a strategy that can yield adversarial robustness in downstream applications. Robust fine-tuning (RFT) (Hendrycks et al., 2019) includes an adversarial objective for learning the features of adversarial data (Madry et al., 2018), which is how it gains adversarial robustness in downstream tasks. To further improve generalization, vanilla RFT (formulated in Eq. 1, shown in the left panel of Figure 1c) optimizes both adversarial and natural objectives, learning the features of adversarial and natural data simultaneously via the FE (Zhang et al., 2019; Shafahi et al., 2019; Jiang et al., 2020).