Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers