Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers
