Compact Language Models via Pruning and Knowledge Distillation
Neural Information Processing Systems
Large language models (LLMs) targeting different deployment scales and sizes are currently produced by training each variant from scratch, which is extremely compute-intensive. In this paper, we investigate whether pruning an existing LLM and then re-training it with a small fraction (<3%) of the original training data can be a suitable alternative to repeated, full retraining.
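To make the pruning step concrete, below is a minimal, hypothetical sketch of unstructured magnitude pruning on a flat list of weights. This toy example is not the paper's method (which operates on an existing LLM and is followed by re-training on a small data fraction); it only illustrates the basic idea of removing the lowest-magnitude parameters.

```python
# Toy illustration of magnitude-based pruning: zero out the fraction
# `sparsity` of weights with the smallest absolute value. This is a
# simplified stand-in, not the pruning procedure used in the paper.

def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-|w| entries zeroed."""
    flat = sorted(abs(w) for w in weights)
    k = int(len(flat) * sparsity)
    # Threshold below which weights are considered unimportant.
    threshold = flat[k] if k < len(flat) else float("inf")
    return [0.0 if abs(w) < threshold else w for w in weights]

# Prune half of a small hypothetical weight vector.
pruned = prune_by_magnitude([0.1, -2.0, 0.05, 1.5, -0.3, 0.02], 0.5)
# → [0.0, -2.0, 0.0, 1.5, -0.3, 0.0]
```

After pruning, the surviving weights would then be fine-tuned (re-trained) so the smaller model recovers accuracy, which is the step the abstract proposes to do with under 3% of the original training data.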