ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization
Liu, Lawrence, Liu, Alexander, Wang, Mengdi, Zhao, Tuo, Yang, Lin F.
–arXiv.org Artificial Intelligence
While semi-structured pruning, particularly 2:4 sparsity, offers a path to practical hardware acceleration, existing methods often incur substantial performance degradation. To bridge this gap, we introduce ARMOR: (Adaptive Representation with Matrix-factORization), a novel one-shot post-training pruning algorithm. Instead of directly pruning weights, ARMOR factorizes each weight matrix into a 2:4 sparse core wrapped by two low-overhead, block diagonal matrices. These wrappers act as efficient pre-and post-transformation error correctors, offering greater flexibility to preserve model quality compared to conventional 2:4 pruning techniques. The sparse core and block diagonal wrappers are chosen through a block coordinate descent algorithm that minimizes a layer-wise proxy loss. We theoretically prove this optimization is guaranteed to converge to a solution with a proxy loss less than or equal to state-of-the-art pruning algorithms. Experiments on Llama (Touvron et al., 2023; Dubey et al., 2024) and Qwen (Y ang et al., 2025) model families demonstrate that ARMOR consistently and significantly outperforms state-of-the-art 2:4 pruning methods across a wide range of downstream tasks and perplexity evaluations. ARMOR achieves this superior performance while retaining the inference speedups and substantial memory usage reductions of 2:4 pruning, establishing a more effective trade-off between model compression and task accuracy. Large Language Models (LLMs) have demonstrated remarkable capabilities (Park et al., 2023; Huang & Y ang, 2025), yet their immense computational and memory requirements pose significant barriers to practical deployment.
arXiv.org Artificial Intelligence
Oct-8-2025
- Country:
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Genre:
- Research Report (1.00)
- Industry:
- Education (0.46)
- Technology: