LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers