Reviews: Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog

Neural Information Processing Systems 

For MNIST, the approximated answerer is count-based, and its recognition accuracy can be controlled in proportion to the actual answerer's accuracy. For GuessWhat, the approximated answerer is trained in four ways: on the same training data as the actual answerer, on answers predicted by the actual answerer, on a different training data split from the actual answerer, and on a different split followed by imitation of the actual answerer's predicted answers on the other split. The proposed approach outperforms the random baseline. Interestingly, the authors find that the depA* models perform better than the indA* models, showing that training on predicted answers is a stronger signal for building an accurate mental model than just sharing training data. I'm happy to recommend this for publication.
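
To make the count-based answerer mentioned for MNIST concrete, here is a minimal sketch of how I read it. It is my own illustration, not the authors' implementation: the class name `CountBasedAnswerer`, the smoothing parameter `alpha`, and the example question string are all hypothetical. It estimates p(answer | class, question) from observed (class, question, answer) triples using smoothed counts.

```python
from collections import defaultdict


class CountBasedAnswerer:
    """Hypothetical count-based estimate of p(answer | class, question)."""

    def __init__(self, answers, alpha=1.0):
        # Fixed answer vocabulary, e.g. ["yes", "no"].
        self.answers = list(answers)
        # Additive (Laplace) smoothing constant; an assumption of this sketch.
        self.alpha = alpha
        # counts[(cls, question)][answer] -> number of times observed.
        self.counts = defaultdict(dict)

    def observe(self, cls, question, answer):
        """Record one answer given by the actual answerer."""
        row = self.counts[(cls, question)]
        row[answer] = row.get(answer, 0.0) + 1.0

    def prob(self, answer, cls, question):
        """Smoothed relative frequency of `answer` for (cls, question)."""
        row = self.counts.get((cls, question), {})
        total = sum(row.values()) + self.alpha * len(self.answers)
        return (row.get(answer, 0.0) + self.alpha) / total


# Usage: three "yes" and one "no" observed for digit class 3.
answerer = CountBasedAnswerer(["yes", "no"])
for a in ["yes", "yes", "yes", "no"]:
    answerer.observe(3, "is the digit odd?", a)
p_yes = answerer.prob("yes", 3, "is the digit odd?")
```

With smoothing, class and question pairs that were never observed fall back to a uniform distribution over the answer vocabulary, which keeps the estimate well defined early in training.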