H3T: Efficient Integration of Memory Optimization and Parallelism for High-Throughput Transformer Training

Yuzhong Wang