Sliding Window Attention Training for Efficient Large Language Models