VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
