A multilevel approach to accelerate the training of Transformers
