Data-Centric and Heterogeneity-Adaptive Sequence Parallelism for Efficient LLM Training