Review for NeurIPS paper: Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping