Input Compression with Positional Consistency for Efficient Training and Inference of Transformer Neural Networks