CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
