Lua-LLM: Learning Unstructured-Sparsity Allocation for Large Language Models

Jun-13-2026, 22:32:13 GMT–Neural Information Processing Systems

Large Language Models (LLMs) have demonstrated remarkable capabilities, yet their extensive parameter scales pose significant challenges for practical deployment. Unstructured pruning has emerged as an effective model compression strategy with minimal performance loss, which introduces fine-grained sparsity for weight parameters. While existing methods employ a layer-wise pruning strategy to avoid the complexity of global pruning for billion-scale LLMs, they require appropriate sparsity allocation for the layer-wise pruning objectives and often lead to suboptimal solutions for the overall model. In this paper, we propose Lua-LLM ($\textbf{L}$earning $\textbf{u}$nstructured-sparsity $\textbf{a}$llocation in LLMs), a learning-based global pruning framework that explores the optimal unstructured sparsity allocation. Unlike existing pruning methods, which primarily focus on allocating per-layer sparsity, Lua-LLM achieves flexible allocation for both layer-wise and intra-layer sparsity.

artificial intelligence, large language model, natural language, (8 more...)

Neural Information Processing Systems

Jun-13-2026, 22:32:13 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)