Review for NeurIPS paper: How does Weight Correlation Affect Generalisation Ability of Deep Neural Networks?

Neural Information Processing Systems 

Weaknesses: * I worry that the claims about the measure being theoretically grounded are wrong, or at least misleading. As I understand it, the paper introduces a method, WCD, which minimises weight correlation along with the loss. To provide performance guarantees like Eq. (3) for this method, one would have to compute the posterior Q that WCD actually gives rise to. Instead, the paper defines a separate posterior, which is inspired by similar concepts but is not derived from WCD and has no evident reason to be tied to it. I therefore find the discussion in Section 4 misleading.