Improving N-gram Language Models with Pre-trained Deep Transformer