Compressing Language Models for Specialized Domains
Miles Williams, George Chrysostomou, Vitor Jeronymo, Nikolaos Aletras
arXiv.org Artificial Intelligence
Compression techniques such as pruning and quantization enable more efficient deployment of language models (LMs), albeit with small drops in benchmark performance. However, general-purpose LM compression methods can degrade performance in specialized domains (e.g., biomedical or legal). Recent work has sought to address this, yet requires computationally expensive full-parameter fine-tuning. To address this limitation, we propose cross-calibration, a novel training-free approach for improving the domain performance of compressed LMs. Our approach leverages Hessian-based sensitivity to identify weights that are influential for both in-domain and general performance. Through extensive experimentation, we demonstrate that cross-calibration substantially outperforms existing approaches on domain-specific tasks without compromising general performance. Notably, these gains come without additional computational overhead, demonstrating strong potential for extracting domain-specialized compressed models from general-purpose LMs.
Feb-25-2025
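The abstract describes scoring weights by Hessian-based sensitivity against calibration data from both the target domain and general-purpose text. The sketch below illustrates the general idea only: it uses the standard OBS-style saliency score w^2 / [H^{-1}]_ii with the Hessian approximated as H = XX^T from calibration activations, and it approximates cross-calibration by simply pooling in-domain and general calibration batches. The pooling step, function names, and shapes are assumptions for illustration, not the paper's exact method.

```python
# Minimal sketch of Hessian-based pruning saliency with pooled ("cross")
# calibration data. Assumes the classic OBS-style score s = w^2 / diag(H^-1);
# the paper's actual cross-calibration weighting may differ.
import numpy as np

def layer_hessian(X, damp=0.01):
    """Approximate the layer Hessian H = X X^T from calibration activations.

    X: (in_features, n_samples) inputs to a linear layer.
    """
    H = X @ X.T
    # Dampen the diagonal so H is well-conditioned and invertible.
    H += damp * np.mean(np.diag(H)) * np.eye(H.shape[0])
    return H

def saliency(W, H):
    """OBS-style per-weight saliency: w^2 / diag(H^-1), broadcast over rows."""
    H_inv_diag = np.diag(np.linalg.inv(H))
    return W ** 2 / H_inv_diag

def prune_cross_calibrated(W, X_domain, X_general, sparsity=0.5):
    """Zero the least salient weights, scoring against pooled calibration data."""
    X = np.concatenate([X_domain, X_general], axis=1)  # pool both domains
    s = saliency(W, layer_hessian(X))
    k = int(sparsity * W.size)
    threshold = np.partition(s.ravel(), k)[k]
    return np.where(s < threshold, 0.0, W)

# Hypothetical example: a 64x128 layer with 256 calibration samples per domain.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))
X_dom = rng.normal(size=(128, 256))
X_gen = rng.normal(size=(128, 256))
W_pruned = prune_cross_calibrated(W, X_dom, X_gen, sparsity=0.5)
print(f"sparsity: {np.mean(W_pruned == 0):.2f}")
```

Because weights kept under a single-domain Hessian may be unimportant under the other, scoring against both activation sets at once is one way to retain weights influential for both in-domain and general performance, which matches the intuition stated in the abstract.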