Export Reviews, Discussions, Author Feedback and Meta-Reviews
–Neural Information Processing Systems
I like the paper: the idea is well described, and the experiments are convincing to a certain degree. The best thing, in my opinion, is that the authors tried to analyze the learned networks with respect to the pattern of gate outputs. Therefore, the response value is never exactly 0 or 1 (as the authors also state), and the gradients in eq. 5 are not correct. The authors should explain in the paper exactly how backpropagation is performed in these networks. I would also like to see a plot of performance as a function of the initial value of the bias.
Feb-6-2025, 13:53:02 GMT