No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models

Jean Kaddour 1 Oscar Key