Effective Long-Context Scaling of Foundation Models