Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers
