LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning

Jun-20-2026, 10:13:01 GMT–Neural Information Processing Systems

Quantization and fine-tuning are crucial for deploying large language models (LLMs) on resource-constrained edge devices. However, fine-tuning quantized models presents significant challenges. The first is the mismatch in data types between the low-precision quantized weights (e.g., 4-bit) and the high-precision adaptation weights (e.g., 16-bit), which limits the computational efficiency advantage that quantized weights offer during inference.
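The data-type mismatch described above can be illustrated with a small NumPy sketch. It is not the paper's method. The symmetric per-tensor quantizer, the matrix sizes, the rank-2 adapter, and all names (quantize_4bit, a, b, delta) are assumptions chosen only for illustration. The sketch shows that adding a 16-bit low-rank update to 4-bit weights generally produces values that no longer sit on the 4-bit grid, so the merged matrix cannot be stored as 4-bit integers without rounding.

```python
import numpy as np


def quantize_4bit(w):
    """Symmetric per-tensor quantization to signed 4-bit levels in [-8, 7].

    Hypothetical helper for illustration; the levels are stored in int8
    because NumPy has no native 4-bit integer type.
    """
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale


rng = np.random.default_rng(0)

# Base weights, quantized to 4-bit integer levels plus one float scale.
w = rng.standard_normal((16, 16))
q, scale = quantize_4bit(w)

# Hypothetical rank-2 adapter held in 16-bit floats.
a = (0.01 * rng.standard_normal((2, 16))).astype(np.float16)
b = (0.01 * rng.standard_normal((16, 2))).astype(np.float16)
delta = b.astype(np.float64) @ a.astype(np.float64)

# Merging the adapter into the dequantized weights.
merged = q.astype(np.float64) * scale + delta

# Position of each merged weight on the quantization grid, in units of scale.
grid_pos = merged / scale
off_grid = np.abs(grid_pos - np.round(grid_pos)) > 1e-6

print("adapter dtype:", a.dtype, "| quantized storage dtype:", q.dtype)
print("fraction of merged weights off the 4-bit grid:", off_grid.mean())
```

Under these assumptions, an inference path has to either keep the adapter as a separate higher-precision branch or store the merged weights at higher precision, which is the efficiency cost the abstract refers to.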

large language model, machine learning, natural language

Country:
- Asia > China (0.68)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.90)
  - Machine Learning > Neural Networks (0.69)
