VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections

Feb-12-2026, 13:46:59 GMT–Neural Information Processing Systems

Using a single projection vector, we then project these individual sub-tokens onto a one-dimensional subspace. Importantly, we notice that we can initialize this projection vector cheaply using first-order batch statistics and then keep it fixed throughout training. We then reconstruct the original tokens using the same vector during the backward pass.
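The steps described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names are hypothetical, and using the normalised mean of the first batch's sub-tokens as the projection vector is only one assumed reading of "first-order batch statistics". The sketch splits a token into sub-tokens, stores one scalar per sub-token in place of the full sub-token, and rebuilds a rank-1 approximation from those scalars, as the backward pass would.

```python
import math


def split_sub_tokens(token, size):
    """Split one token (a list of floats) into consecutive sub-tokens of length `size`."""
    if len(token) % size != 0:
        raise ValueError("token length must be divisible by the sub-token size")
    return [token[i:i + size] for i in range(0, len(token), size)]


def init_projection(sub_tokens):
    """Assumed initialisation: the unit-normalised mean of the given sub-tokens.

    The vector is computed once and then kept fixed.
    """
    size = len(sub_tokens[0])
    mean = [sum(s[j] for s in sub_tokens) / len(sub_tokens) for j in range(size)]
    norm = math.sqrt(sum(m * m for m in mean))
    if norm == 0.0:
        raise ValueError("mean sub-token is zero; cannot normalise")
    return [m / norm for m in mean]


def compress(sub_tokens, v):
    """Forward pass: keep only the scalar coefficient <s, v> for each sub-token s."""
    return [sum(a * b for a, b in zip(s, v)) for s in sub_tokens]


def reconstruct(coeffs, v):
    """Backward pass: rebuild a rank-1 approximation c * v of each sub-token."""
    return [[c * x for x in v] for c in coeffs]


# Toy example: one 4-dimensional token split into two 2-dimensional sub-tokens.
token = [3.0, 4.0, 6.0, 8.0]
sub_tokens = split_sub_tokens(token, 2)
v = init_projection(sub_tokens)        # fixed after this point
coeffs = compress(sub_tokens, v)       # two scalars stored instead of four floats
approx = reconstruct(coeffs, v)        # approximation used during the backward pass
```

In this toy case the two sub-tokens are collinear, so the reconstruction matches the input up to floating-point rounding. In general the reconstruction is lossy, and the memory saved grows with the sub-token size, since each sub-token is reduced to a single stored scalar.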

justification, large language model, machine learning

Genre:
- Research Report (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.69)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
