LD-MoLE: Learnable Dynamic Routing for Mixture of LoRA Experts

Zhuang, Yuan, Shen, Yi, Bian, Yuexin, Su, Qing, Ji, Shihao, Shi, Yuanyuan, Miao, Fei

Oct-1-2025–arXiv.org Artificial Intelligence

Recent studies have shown that combining parameter-efficient fine-tuning (PEFT) with mixture-of-experts (MoE) is an effective strategy for adapting large language models (LLMs) to downstream tasks. However, most existing approaches rely on conventional TopK routing, which requires careful hyperparameter tuning and assigns a fixed number of experts to each token. In this work, we propose LD-MoLE, a Learnable Dynamic routing mechanism for Mixture of LoRA Experts that enables adaptive, token-dependent, and layer-wise expert allocation. Our method replaces the non-differentiable TopK selection with a differentiable routing function and a closed-form solution. Moreover, our design allows the model to adaptively determine the number of experts to activate for each token at different layers. In addition, we introduce an analytical sparsity control objective to regularize the number of activated experts. Our method not only achieves superior performance, but also demonstrates the ability to learn token-dependent and layer-wise expert allocation.

Large language models (LLMs) have demonstrated impressive capabilities across a wide range of natural language processing (NLP) tasks. However, their growing size requires significant computational resources for full-parameter fine-tuning. To address this, Parameter-Efficient Fine-Tuning (PEFT) methods, such as Adapter-tuning (Houlsby et al., 2019) and LoRA (Hu et al., 2021), have emerged as crucial techniques for reducing training costs. Recently, the Mixture-of-Experts (MoE) design (Jacobs et al., 1991; Shazeer et al., 2017) has been successfully integrated into transformer feed-forward networks during LLM pretraining (Dai et al., 2024; Yang et al., 2025), demonstrating that MoE can reduce computational cost while maintaining strong performance.
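The contrast the abstract draws between fixed TopK routing and a differentiable, closed-form alternative can be illustrated with a small sketch. The abstract does not name the routing function LD-MoLE uses, so the example below substitutes sparsemax (Martins and Astudillo, 2016), a well-known closed-form projection onto the probability simplex, purely as a stand-in. The function names `topk_route` and `sparsemax_route` and the example logits are hypothetical, and the learnable components and the sparsity control objective described above are not modeled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk_route(logits, k):
    """Conventional TopK routing: keep the k largest router logits and
    renormalise them with softmax. Every token receives exactly k experts."""
    keep = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in keep])
    weights = [0.0] * len(logits)
    for i, p in zip(keep, probs):
        weights[i] = p
    return weights


def sparsemax_route(logits):
    """Sparsemax routing (a stand-in, NOT confirmed as the paper's function).

    Closed form: w_i = max(z_i - tau, 0), where the threshold tau is chosen
    so that the weights sum to 1. Experts whose logit falls below tau get
    exactly zero weight, so the active-expert count depends on the logits.
    """
    z_sorted = sorted(logits, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for j, z in enumerate(z_sorted, start=1):
        cumsum += z
        # The support condition holds for a prefix of the sorted logits;
        # the last j that satisfies it determines the threshold.
        if 1.0 + j * z > cumsum:
            tau = (cumsum - 1.0) / j
    return [max(z - tau, 0.0) for z in logits]


def num_active(weights):
    """Count experts with strictly positive routing weight."""
    return sum(1 for w in weights if w > 0.0)


if __name__ == "__main__":
    # Hypothetical router logits for two tokens over four LoRA experts.
    confident_token = [2.0, 1.0, 0.1, -1.0]
    ambiguous_token = [0.5, 0.4, 0.3, -2.0]
    for name, logits in [("confident", confident_token), ("ambiguous", ambiguous_token)]:
        print(name, "TopK(k=2):", num_active(topk_route(logits, 2)),
              "sparsemax:", num_active(sparsemax_route(logits)))
```

With TopK, both tokens activate two experts regardless of how peaked the logits are. With the sparsemax stand-in, the token with one dominant logit activates fewer experts than the token with several comparable logits, which is the kind of token-dependent allocation the abstract describes.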

large language model, machine learning, natural language


Country:
- North America > United States (0.46)

Genre:
- Research Report (0.85)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Perceptrons (0.48)
