Reviews: Compositional De-Attention Networks

Neural Information Processing Systems 

UPDATE after reading author rebuttal: I am looking forward to the more comprehensive evaluation that you are carrying out. Regarding Q3, please include details of the setup in the main paper. The main paper also needs more analysis of why zeroes are predominant in M (a point also raised by R3), rather than speculation or hypothesis. Overall, my opinion of the paper has not changed, and I feel it is a good direction of research.

This paper proposes an alternative to the softmax-based attention mechanism, a quasi-attention technique. Instead of the usual single affinity matrix, it uses a dual affinity matrix approach. One affinity matrix is created from the pairwise similarity computation.
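For concreteness, here is a minimal sketch of how I read the dual-affinity idea. Only the pairwise-similarity matrix is described above; the second matrix (taken here to be a negative pairwise L1 distance), the tanh/sigmoid composition, and every name and parameter (`quasi_attention`, `alpha`, `beta`) are my assumptions for illustration, not the authors' implementation.

```python
import math


def quasi_attention(queries, keys, values, alpha=1.0, beta=1.0):
    """Hypothetical sketch of a dual-affinity quasi-attention step.

    queries, keys: lists of equal-length float vectors.
    values: one float vector per key.
    Returns (weights, outputs), where weights[i][j] plays the role of M.
    """
    weights, outputs = [], []
    for q in queries:
        row = []
        for k in keys:
            # First affinity: pairwise similarity (scaled dot product).
            sim = alpha * sum(qi * ki for qi, ki in zip(q, k))
            # Second affinity (assumption): negative pairwise L1 distance.
            neg_dist = -beta * sum(abs(qi - ki) for qi, ki in zip(q, k))
            # Sigmoid of a non-positive number, written in a stable form;
            # its value lies in (0, 0.5].
            e = math.exp(neg_dist)
            gate = e / (1.0 + e)
            # Composition (assumption): signed similarity times gate,
            # with no softmax normalisation over the keys.
            row.append(math.tanh(sim) * gate)
        weights.append(row)
        # Weighted sum of value vectors; weights may be negative or zero.
        dim = len(values[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        )
    return weights, outputs


# Toy inputs: one query, three keys (aligned, opposed, orthogonal).
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
values = [[1.0], [2.0], [3.0]]
weights, outputs = quasi_attention(queries, keys, values)
```

Under these assumptions the aligned key gets a positive weight, the opposed key a negative one, and the orthogonal key exactly zero, so a value can be added, subtracted, or ignored. Whether this accounts for the zeroes observed in M is exactly what the requested analysis should establish.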