AITopics

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)

Neural Information Processing SystemsFeb-8-2026, 03:44:42 GMT

123fd8a56501194823c8e0dca00733df-Paper-Conference.pdf

fine-tuning, hydralora, matrix, (15 more...)

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
(15 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
(2 more...)

arXiv.org Artificial IntelligenceOct-22-2025

Noise-Robustness Through Noise: A Framework combining Asymmetric LoRA with Poisoning MoE

Wang, Zhaokun, Guo, Jinyu, Pu, Jingwen, Chen, Lingfeng, Pu, Hongli, Ou, Jie, Qin, Libo, Tian, Wenhong

Current parameter-efficient fine-tuning methods for adapting pre-trained language models to downstream tasks are susceptible to interference from noisy data. Conventional noise-handling approaches either rely on laborious data pre-processing or employ model architecture modifications prone to error accumulation. In contrast to existing noise-process paradigms, we propose a noise-robust adaptation method via asymmetric LoRA poisoning experts (LoPE), a novel framework that enhances model robustness to noise only with generated noisy data. Drawing inspiration from the mixture-of-experts architecture, LoPE strategically integrates a dedicated poisoning expert in an asymmetric LoRA configuration. Through a two-stage paradigm, LoPE performs noise injection on the poisoning expert during fine-tuning to enhance its noise discrimination and processing ability. During inference, we selectively mask the dedicated poisoning expert to leverage purified knowledge acquired by normal experts for noise-robust output. Extensive experiments demonstrate that LoPE achieves strong performance and robustness purely through the low-cost noise injection, which completely eliminates the requirement of data cleaning.

large language model, machine learning, natural language, (24 more...)

2505.23868

Country:

Asia > China (0.28)
North America > United States > Minnesota (0.28)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
(2 more...)

arXiv.org Artificial IntelligenceOct-14-2025

MeTA-LoRA: Data-Efficient Multi-Task Fine-Tuning for Large Language Models

Cheng, Bo, Wang, Xu, Liu, Jinda, Chang, Yi, Wu, Yuan

Low-Rank Adaptation (LoRA) has emerged as one of the most widely used parameter-efficient fine-tuning (PEFT) methods for adapting large language models (LLMs) to downstream tasks. While highly effective in single-task settings, it struggles to efficiently leverage inter-task knowledge in complex multi-task learning scenarios, often requiring substantial task-specific data to achieve optimal performance. To address this limitation, we introduce MeTA-LoRA, a two-stage optimization framework that significantly improves data efficiency in multi-task adaptation. In the first stage, task-specific LoRA adapters are learned using only a few samples from each involved dataset, enabling rapid adaptation without large-scale supervision. In the second stage, the shared LoRA adapter is updated by aggregating gradients from multiple tasks to promote knowledge transfer across tasks, further reducing data usage by leveraging common patterns. In both multi-task learning and multilingual learning scenarios, our method matches or surpasses the performance of traditional full-data LoRA fine-tuning approaches, while using significantly less task-specific data.

arxiv preprint arxiv, large language model, machine learning, (14 more...)

2510.11598

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsOct-9-2025, 18:57:49 GMT

HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning

B matrices to avoid training interference.

hydralora, lora, matrix, (16 more...)

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
(15 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
(2 more...)

arXiv.org Artificial IntelligenceSep-30-2025

ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation

Liang, Jian, Huang, Wenke, Guo, Xianda, Wan, Guancheng, Du, Bo, Ye, Mang

Low-Rank Adaptation (LoRA) is widely adopted for downstream fine-tuning of foundation models due to its efficiency and zero additional inference cost. Many real-world applications require foundation models to specialize in several specific tasks simultaneously, motivating the need for efficient multi-task downstream adaptation. To address this need, existing studies have primarily explored two directions: Model Merging with LoRA, which shows advantages in training-free scenarios but still lags behind multi-task training in overall performance; and MoE-based LoRA approaches, which improve multi-task learning performance but introduce routers that hinder the mergeability of LoRA parameters and incur considerable inference overhead, thereby limiting real-world deployment practicality. To this end, we propose ThanoRA, a Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation framework that enables effective, efficient and unified multi-task downstream adaptation without introducing additional structure. ThanoRA performs multi-task learning by tailoring subspace allocation at initialization and enforcing diversity preservation throughout training: it allocates varying dimensions to construct task-specific low-rank subspaces driven by inter-task heterogeneity, enabling fine-grained knowledge injection, while diversity-preserving regularization mitigates task interference and subspace collapse, thereby fully exploiting the low-rank capacity. Extensive experiments across multimodal and text-only benchmarks under varying multi-task mixtures demonstrate that ThanoRA consistently outperforms strong baselines, surpassing even separate task-specific fine-tuning, while introducing no additional structures or inference overhead. Our code will be publicly available at: https://github.com/LiangJian24/ThanoRA.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

2505.1864

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsMay-26-2025, 16:37:55 GMT

HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning

Adapting Large Language Models (LLMs) to new tasks through fine-tuning has been made more efficient by the introduction of Parameter-Efficient Fine-Tuning (PEFT) techniques, such as LoRA. However, these methods often underperform compared to full fine-tuning, particularly in scenarios involving complex datasets. This issue becomes even more pronounced in complex domains, highlighting the need for improved PEFT approaches that can achieve better performance. Through a series of experiments, we have uncovered two critical insights that shed light on the training and parameter inefficiency of LoRA. Building on these insights, we have developed HydraLoRA, a LoRA framework with an asymmetric structure that eliminates the need for domain expertise.

artificial intelligence, large language model, natural language, (4 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.65)

arXiv.org Artificial IntelligenceFeb-21-2025

R-LoRA: Random Initialization of Multi-Head LoRA for Multi-Task Learning

Liu, Jinda, Chang, Yi, Wu, Yuan

Fine-tuning large language models (LLMs) is prohibitively expensive in terms of computational and memory costs. Low-rank Adaptation (LoRA), as one of the most popular parameter-efficient fine-tuning (PEFT) methods, offers a cost-effective alternative by approximating the model changes $\Delta W \in \mathbb{R}^{m \times n}$ through the product of down-projection matrix $A \in \mathbb{R}^{m \times r}$ and head matrix $B \in \mathbb{R}^{r \times n}$, where $r \ll \min(m, n)$. In real-world scenarios, LLMs are fine-tuned on data from multiple domains to perform tasks across various fields, embodying multi-task learning (MTL). LoRA often underperforms in such complex scenarios. To enhance LoRA's capability in multi-task learning, we propose R-LoRA, which incorporates Multi-Head Randomization. Multi-Head Randomization diversifies the head matrices through Multi-Head Random Initialization and Multi-Head Dropout, enabling more efficient learning of task-specific features while maintaining shared knowledge representation. Extensive experiments demonstrate that R-LoRA is better at capturing task-specific knowledge, thereby improving performance in multi-task scenarios. The code is available at https://github.com/jinda-liu/R-LoRA.

head matrix, matrix, r-lora, (13 more...)

2502.15455

Country: Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceMay-23-2024

HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning

Tian, Chunlin, Shi, Zhan, Guo, Zhijiang, Li, Li, Xu, Chengzhong

Adapting Large Language Models (LLMs) to new tasks through fine-tuning has been made more efficient by the introduction of Parameter-Efficient Fine-Tuning (PEFT) techniques, such as LoRA. However, these methods often underperform compared to full fine-tuning, particularly in scenarios involving complex datasets. This issue becomes even more pronounced in complex domains, highlighting the need for improved PEFT approaches that can achieve better performance. Through a series of experiments, we have uncovered two critical insights that shed light on the training and parameter inefficiency of LoRA. Building on these insights, we have developed HydraLoRA, a LoRA framework with an asymmetric structure that eliminates the need for domain expertise. Our experiments demonstrate that HydraLoRA outperforms other PEFT approaches, even those that rely on domain knowledge during the training and inference phases.

hydralora, lora, matrix, (15 more...)