Thanks to all of the reviewers for their time and effort, and for both constructive and critical comments.

Neural Information Processing Systems 

Reviewer 1 puts it well that "replacing it [KL] by the L2 Wasserstein should strike the majority of researchers as an …". We thank Reviewer 1 for their supportive comments and helpful suggestions, e.g. on the broader impact.

"Found analytically for fewer distributions than EP": While we agree, please see General Comment 1. If the CDF is accessible, Equation (5), which forms the basis of our lookup tables, is stable and avoids divergence; if the CDF is not accessible, there are double integrations.

"not clear if the proposed method is worth it": Please see General Comments 2.1 and 2.2. Besides, Figures 1.a and 1.b illustrate the effectiveness of our method in alleviating the over-estimation of variances.

"primary motivation and reason for pursuing QP over EP": Please see General Comment 3.

"No analysis of fixed …": We plan to build on [Expectation Consistent Approximate Inference, Opper & Winther] and to prove the property pointed out by Reviewer 2 that "at convergence, the parameters of the approximate factors can be considered fixed". We cannot yet rule out that our method is provably convergent under appropriate assumptions; we will also use the extra page in the final version for this.

"I'm not sure whether the page on the locality property is …": We ask the reviewer to kindly consider the broader relevance outlined in General Comment 3.

"Not sure whether …": Please see General Comment 1.

"The degree of novelty is pretty small": Please see General Comment 3.

"marginal likelihood and its accuracy": Please see General Comments 2.1 and 2.2. We use the marginal likelihood (ML) of the approximate GP in our work, as is also done in EP. Please also see General Comment 4.

"A discussion why values for p …": Please see Appendix B. These cases are interesting but also even more challenging to handle.
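For a reviewer wishing to verify the stability claim, the CDF-based route can be sketched in one dimension. This is a minimal illustration, not the paper's Equation (5) or lookup tables: for univariate distributions, the squared L2 Wasserstein distance equals the integral over (0, 1) of the squared difference of quantile functions (inverse CDFs), so whenever the CDF is accessible the computation involves no density ratios that can diverge. The function name and discretization below are our own assumptions.

```python
# Sketch (assumed, not the paper's implementation): for 1-D distributions,
# W2(P, Q)^2 = \int_0^1 (F_P^{-1}(u) - F_Q^{-1}(u))^2 du,
# which is numerically stable whenever the quantile function is accessible.
import numpy as np
from scipy import stats

def w2_1d(ppf_p, ppf_q, n=20000):
    """Midpoint-rule estimate of the L2 Wasserstein distance between two
    1-D distributions, given their quantile functions (inverse CDFs)."""
    u = (np.arange(n) + 0.5) / n          # midpoints of (0, 1); avoids u = 0, 1
    diff = ppf_p(u) - ppf_q(u)
    return float(np.sqrt(np.mean(diff ** 2)))

# Sanity check against the closed form for Gaussians:
# W2(N(m1, s1^2), N(m2, s2^2)) = sqrt((m1 - m2)^2 + (s1 - s2)^2).
approx = w2_1d(stats.norm(0.0, 1.0).ppf, stats.norm(1.0, 2.0).ppf)
exact = np.sqrt((0.0 - 1.0) ** 2 + (1.0 - 2.0) ** 2)  # = sqrt(2)
```

The Gaussian pair is used only because its W2 distance has a closed form, making the quadrature easy to validate; the same routine applies to any distribution whose quantile function is available.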