Efficient Softmax Approximation for Deep Neural Networks with Attention Mechanism