Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling