Neural Information Processing Systems 

To Reviewer #1: Any two distinct scalars on the line are equivalent choices, so we use 0 and 1 to simplify the discussion. We also experimented with sigmoid functions, but they did not work: the best performance was 27% on CIFAR-10 (worse than a simple kNN).

To Reviewer #2: For beam search, n is very large.