Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models

Lujun Li

Oct-10-2025, 22:49:35 GMT–Neural Information Processing Systems

In this paper, we present DSA, the first automated framework for discovering sparsity allocation schemes for layer-wise pruning in Large Language Models (LLMs). LLMs have become increasingly powerful, but their large parameter counts make them computationally expensive. Existing pruning methods for compressing LLMs primarily focus on evaluating redundancies and removing element-wise weights. However, these methods fail to allocate adaptive layer-wise sparsities, leading to performance degradation in challenging tasks.
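To make "layer-wise sparsity allocation" concrete, here is a minimal sketch of the general idea. It is not the DSA method. It assumes a hand-written linear schedule in place of a discovered allocation function and uses plain magnitude pruning. The names `allocate_sparsities`, `magnitude_prune`, `target`, and `spread` are hypothetical and chosen only for this illustration.

```python
import random


def allocate_sparsities(num_layers, target, spread):
    """Hypothetical linear schedule (an assumption, not DSA's allocation function).

    Sparsity rises linearly with depth from target - spread to target + spread,
    so the mean over layers stays equal to `target`.
    Assumes spread <= min(target, 1 - target) so every value stays in [0, 1].
    """
    if num_layers == 1:
        return [target]
    sparsities = []
    for i in range(num_layers):
        # offset runs from -spread (first layer) to +spread (last layer)
        offset = spread * (2 * i / (num_layers - 1) - 1)
        sparsities.append(target + offset)
    return sparsities


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of entries with the smallest magnitude."""
    k = int(round(sparsity * len(weights)))
    if k == 0:
        return list(weights)
    # indices sorted by ascending absolute value; the first k are dropped
    order = sorted(range(len(weights)), key=lambda j: abs(weights[j]))
    drop = set(order[:k])
    return [0.0 if j in drop else w for j, w in enumerate(weights)]


# Toy "model": four layers, each a flat list of 100 random weights.
random.seed(0)
layers = [[random.gauss(0.0, 1.0) for _ in range(100)] for _ in range(4)]

sparsities = allocate_sparsities(len(layers), target=0.5, spread=0.2)
pruned = [magnitude_prune(w, s) for w, s in zip(layers, sparsities)]
zeros_per_layer = [sum(1 for w in layer if w == 0.0) for layer in pruned]
```

In this sketch every layer gets its own pruning ratio while the overall ratio stays at the global target. A uniform scheme would instead apply the same ratio to all layers, which is the limitation the abstract attributes to existing methods.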

allocation function, language model, operation, (15 more...)

Country:
- Asia > China
  - Hong Kong (0.04)
  - Heilongjiang Province > Harbin (0.04)
  - Guangdong Province
    - Shenzhen (0.04)
    - Guangzhou (0.04)

Genre:
- Research Report > Experimental Study (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
