Reviews: A Stein variational Newton method

Neural Information Processing Systems 

Summary: SVGD iteratively moves a set of particles toward the target distribution by choosing, within an RKHS, the perturbation direction that maximally decreases the KL divergence to the target. The paper proposes incorporating second-order information into the SVGD updates; preliminary empirical results show that the method converges faster in a few cases. The paper is well written, and the proofs seem correct. An important reason for using second-order information is the hope of achieving a faster convergence rate. My major concern is the lack of a theoretical analysis of the convergence rate in this paper: 1) An appealing property of SVGD is that the optimal rate of decrease equals the Stein discrepancy D_F(q || p), where F is a function set that includes all possible velocity fields.