Review for NeurIPS paper: Primal Dual Interpretation of the Proximal Stochastic Gradient Langevin Algorithm

Neural Information Processing Systems 

Additional Feedback: Post rebuttal: The authors addressed my comments. I therefore keep my score at 'accept', but do not raise it, as I think the clarity of the writing should be improved. When G is nonsmooth and proximable, using proximal maps leads to much faster convergence than using subgradients in the optimization setting. Investigating the sampling analogue of this scheme, which is the topic of this paper, is therefore an important problem. As mentioned, some previous work exists on this problem, but this paper presents the most general approach to date (in the sense that G may be supported on a more general set).
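
To make the optimization-side claim above concrete, here is a minimal sketch on a toy problem of my own choosing (it is not taken from the paper). It assumes F(x) = 0.5 * ||x - b||^2 and G(x) = lam * ||x||_1, and runs a proximal gradient step and a subgradient step with the same constant step size. The values of b, lam, step and n_iters are illustrative assumptions, and a constant step is only one possible choice for the subgradient method.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1, i.e. coordinate-wise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Toy problem (assumed for illustration): minimize F(x) + G(x) with
#   F(x) = 0.5 * ||x - b||^2   (smooth)
#   G(x) = lam * ||x||_1       (nonsmooth, proximable)
b = np.array([3.0, 0.5, -2.0])
lam = 1.0
x_star = soft_threshold(b, lam)  # closed-form minimizer of this toy problem

step = 0.5     # same constant step size for both methods (assumption)
n_iters = 50

x_prox = np.zeros_like(b)
x_sub = np.zeros_like(b)
for _ in range(n_iters):
    # Proximal gradient: gradient step on F, then proximal map of step * G.
    x_prox = soft_threshold(x_prox - step * (x_prox - b), step * lam)
    # Subgradient method: np.sign(0) == 0 is a valid subgradient of |.| at 0.
    x_sub = x_sub - step * ((x_sub - b) + lam * np.sign(x_sub))

err_prox = np.linalg.norm(x_prox - x_star)
err_sub = np.linalg.norm(x_sub - x_star)
print("proximal gradient error:", err_prox)
print("subgradient error:      ", err_sub)
```

In this toy setting the proximal iterate settles on the kink of G at zero in the second coordinate, whereas the constant-step subgradient iterate keeps oscillating around it. A diminishing step size would remove the oscillation but at a slower rate.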