Review for NeurIPS paper: Decentralized Langevin Dynamics for Bayesian Learning


I think the distributed setting, and the subtleties that come with it, should have been explored more thoroughly. I was left wondering about the communication costs of the architecture needed for this to work, and about the failure modes such an architecture introduces. One obvious question: how would the iterate updates at each node be affected by random noise injected into the w_j's as they are passed over the communication channels, and/or by random drops or missed updates? (A sketch of the kind of perturbation I have in mind is given at the end of this review.)

Separately, the only difference between the convergence analysis in \S4.1 and other works in the literature that use similar machinery appears to be the formulation of the extra constants and iterate weights arising in the distributed setting. In my opinion, this somewhat reduces the novelty and significance of that section.
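To make the robustness question concrete, suppose the node updates take a standard gossip-plus-Langevin form (this is my own notation and an assumption on my part; the paper's exact update rule may differ):

  w_i^{(k+1)} = \sum_{j \in \mathcal{N}_i \cup \{i\}} a_{ij} \, w_j^{(k)} + \eta \, \nabla_w \log p_i\big(w_i^{(k)}\big) + \sqrt{2\eta}\, \xi_i^{(k)}, \qquad \xi_i^{(k)} \sim \mathcal{N}(0, I),

where the a_{ij} are (say, doubly stochastic) mixing weights, p_i is node i's local posterior factor, and \eta is the step size. A noisy channel would effectively replace each received w_j^{(k)} with w_j^{(k)} + \epsilon_{ij}^{(k)}, and a dropped or missed message with a stale copy w_j^{(k')} for some k' < k. It is unclear to me how the constants and iterate weights in \S4.1 absorb (or fail to absorb) such perturbations, and a discussion of this would strengthen the paper.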