Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing

Rausch, Roman, Jansen, David, Singh, Sukhbinder, Orús, Román

Dec-4-2025–arXiv.org Machine Learning

Large Language Models (LLMs) are computationally demanding. Low-rank decompositions of LLM weights, e.g. via Singular Value Decomposition (SVD), are a promising approach to LLM compression, but they present several practical hurdles, such as selecting appropriate layer-wise ranks and removing their parameter redundancy. In this work, we present two physics-inspired improvements to SVD-based LLM compression: (1) FermiGrad, a gradient-descent algorithm that determines globally optimal layer-wise ranks by relaxing the discrete singular-value truncation into a continuous optimization using the Fermi function; (2) PivGa, an additional lossless compression of the low-rank factors that exploits the intrinsic gauge freedom in their parametrization.
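The two ideas named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names (`fermi_mask`, `soft_truncate`, `gauge_fix`), the parameters `mu` and `temperature`, and the choice of the first r rows as the gauge-fixing block are all assumptions made for illustration. The first part weights the singular values with the Fermi function f_i = 1 / (exp((i - mu) / T) + 1), which turns the hard cut at a given rank into a smooth function of mu. The second part uses the fact that a factorization W = A B is unchanged under A -> A G, B -> G^{-1} B for any invertible G, and picks G so that an r x r block of A becomes the identity.

```python
import numpy as np

def fermi_mask(num_singular_values, mu, temperature):
    """Soft truncation weights f_i = 1 / (exp((i - mu) / T) + 1).

    Hypothetical helper: mu plays the role of a continuous rank and
    temperature controls how sharp the cutoff is.
    """
    i = np.arange(num_singular_values)
    # Clip the exponent so that very small temperatures do not overflow.
    x = np.clip((i - mu) / temperature, -60.0, 60.0)
    return 1.0 / (np.exp(x) + 1.0)

def soft_truncate(W, mu, temperature):
    """Reweight the singular values of W with the Fermi mask."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    f = fermi_mask(len(s), mu, temperature)
    return (U * (s * f)) @ Vt

def gauge_fix(A, B):
    """Return (A', B') with A' @ B' == A @ B and A'[:r] equal to the identity.

    Assumption: the first r rows of A are invertible. A pivoted choice of
    rows would be more robust; this sketch skips the pivoting.
    """
    r = A.shape[1]
    A_top = A[:r, :]
    G = np.linalg.inv(A_top)   # gauge matrix, so that A_top @ G = I
    return A @ G, A_top @ B    # B' = G^{-1} B = A_top @ B

# Small demonstration on a random matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 5))
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# At low temperature the soft mask approaches a hard rank-3 truncation.
hard_rank3 = (U[:, :3] * s[:3]) @ Vt[:3]
soft_rank3 = soft_truncate(W, mu=2.5, temperature=1e-3)

# Gauge-fix the rank-3 factors without changing their product.
A, B = U[:, :3] * s[:3], Vt[:3]
A_fixed, B_fixed = gauge_fix(A, B)
```

In this sketch the sum of the mask, `fermi_mask(n, mu, T).sum()`, acts as a differentiable stand-in for the rank, which is what makes a gradient-based choice of `mu` per layer conceivable. After gauge fixing, the identity block of `A_fixed` carries no information, so the two factors need r(m + n - r) stored numbers instead of r(m + n), with the product unchanged.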

artificial intelligence, large language model, natural language

Country:
- Europe > Spain
  - Basque Country > Biscay Province > Bilbao (0.04)
- North America > Canada
  - Ontario > Toronto (0.04)

Genre:
- Research Report (0.70)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)