Reviews: Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models

Neural Information Processing Systems 

Update after author response: Thanks for the detailed response! It's a strong submission and I vote to accept.

This paper aims to speed up the computation of the softmax over a large vocabulary, which is common in NLP tasks such as language modeling. Specifically, the proposed method formulates the problem as a nearest neighbor search in a small world graph and applies a logarithmic-time algorithm to find the approximate top-K predictions. The resulting expected time complexity is logarithmic in the vocabulary size, in contrast to the linear cost of a standard softmax.
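To make the summary concrete, here is a minimal sketch of the general idea, not the authors' implementation. Because softmax is monotone in the logits, the top-K predictions are the K words whose embeddings have the largest inner product with the context vector, so the full normalization can be skipped and the problem becomes a search. All names below (`exact_top_k`, `build_knn_graph`, `graph_top_k`, `ef`) are hypothetical. The brute-force neighbor graph is only a stand-in for a small world graph and does not provide the logarithmic guarantee. Searching directly by inner product is also a simplifying assumption.

```python
import heapq
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exact_top_k(context, embeddings, k):
    """Baseline: score every word, linear in the vocabulary size."""
    scores = [dot(context, w) for w in embeddings]
    return sorted(range(len(embeddings)), key=lambda i: -scores[i])[:k]


def build_knn_graph(embeddings, degree):
    """Hypothetical stand-in for a small world graph (brute-force build)."""
    graph = []
    for i, w in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        others.sort(key=lambda j: -dot(w, embeddings[j]))
        graph.append(others[:degree])
    return graph


def graph_top_k(context, embeddings, graph, k, ef, entry=0):
    """Best-first graph search that keeps at most `ef` results.

    Returns the approximate top-k word ids and the number of words scored.
    """
    visited = {entry}
    s = dot(context, embeddings[entry])
    candidates = [(-s, entry)]  # max-heap on score (negated)
    best = [(s, entry)]         # min-heap holding the best `ef` so far
    while candidates:
        neg_s, node = heapq.heappop(candidates)
        # Stop once the best remaining candidate cannot improve the results.
        if len(best) >= ef and -neg_s < best[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            s = dot(context, embeddings[nb])
            if len(best) < ef or s > best[0][0]:
                heapq.heappush(candidates, (-s, nb))
                heapq.heappush(best, (s, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    top = sorted(best, reverse=True)[:k]
    return [i for _, i in top], len(visited)


# Toy data: 200 random "word embeddings" and one random context vector.
rng = random.Random(0)
embeddings = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
context = [rng.gauss(0, 1) for _ in range(8)]
graph = build_knn_graph(embeddings, degree=8)

exact = exact_top_k(context, embeddings, k=5)
approx, num_scored = graph_top_k(context, embeddings, graph, k=5, ef=16)
print("exact top-5:", exact)
print("graph top-5:", approx, "words scored:", num_scored)
```

In this sketch, `ef` trades accuracy for speed: a larger value scores more words and is more likely to recover the exact top-K.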