LoQT: Low Rank Adapters for Quantized Training

Sebastian Loeschcke, Mads Toftrup, Michael J. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson

May-26-2024–arXiv.org Artificial Intelligence

Training of large neural networks requires significant computational resources. Despite advances using low-rank adapters and quantization, pretraining of models such as LLMs on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates. To address these limitations, we propose LoQT, a method for efficiently training quantized models. LoQT uses gradient-based tensor factorization to initialize low-rank trainable weight matrices that are periodically merged into quantized full-rank weight matrices. Our approach is suitable for both pretraining and fine-tuning of models, which we demonstrate experimentally for language modeling and downstream task adaptation. We find that LoQT enables efficient training of models up to 7B parameters on a consumer-grade 24GB GPU. We also demonstrate the feasibility of training a 13B parameter model using per-layer gradient updates on the same hardware.
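The core loop described in the abstract (initialize a low-rank factor from the gradient, train it, then periodically merge it into the quantized full-rank weights) can be illustrated with a toy NumPy sketch. This is not the authors' implementation. It assumes a simulated uniform quantizer in place of a real low-bit format, a quadratic toy loss, a fixed projection `p` taken from the gradient's top singular vectors, and a trainable factor `b` that starts at zero. The names `fake_quantize`, `projection_from_gradient`, and `train_toy` are hypothetical.

```python
import numpy as np


def fake_quantize(w, num_bits=4):
    """Round w onto a uniform symmetric grid and return it as floats.

    Assumption: a simple stand-in for a real low-bit storage format.
    """
    levels = 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / levels
    return np.round(w / scale) * scale


def projection_from_gradient(grad, rank):
    """Return the top-`rank` left singular vectors of the gradient."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank]


def train_toy(target, rank=4, rounds=8, steps_per_round=20, lr=0.5, seed=0):
    """Fit `target` with quantized weights plus a periodically merged low-rank factor.

    Toy loss: 0.5 * ||W - target||_F^2 with W = w_q + p @ b, so dL/dW = W - target.
    """
    rng = np.random.default_rng(seed)
    w_q = fake_quantize(rng.standard_normal(target.shape))
    initial_err = np.linalg.norm(w_q - target)

    for _ in range(rounds):
        # Initialize the fixed projection from the current full-rank gradient.
        grad_w = w_q - target
        p = projection_from_gradient(grad_w, rank)
        b = np.zeros((rank, target.shape[1]))  # only b is trained

        for _ in range(steps_per_round):
            grad_w = (w_q + p @ b) - target
            b -= lr * (p.T @ grad_w)  # chain rule: dL/db = p^T dL/dW

        # Merge the low-rank update into the full-rank weights and requantize.
        w_q = fake_quantize(w_q + p @ b)

    final_err = np.linalg.norm(w_q - target)
    return initial_err, final_err, w_q
```

Calling `train_toy(np.random.default_rng(1).standard_normal((16, 16)))` returns the error before and after training together with the final quantized weights. In this sketch only the small factor `b` receives gradient updates between merges, which is where the memory savings of a low-rank approach would come from.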

language model, loqt, quantization

Country:
- Europe
  - Denmark > Capital Region
    - Copenhagen (0.05)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report > New Finding (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.68)
  - Machine Learning
    - Neural Networks (0.89)
    - Statistical Learning > Gradient Descent (0.56)
