A General and Efficient Training for Transformer via Token Expansion
