Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
Neural Information Processing Systems
While parameter-efficient fine-tuning (PEFT) methods reduce the memory consumed by the optimizer state during fine-tuning, the sheer size of pre-trained LLM weights remains a pressing concern. And although quantization techniques are widely used to ease memory demands and accelerate LLM inference, most of them target only the deployment phase.