Training Long-Context LLMs Efficiently via Chunk-wise Optimization