Using BERT: state-of-the-art pre-training for natural language processing