agree that the sketched argument based on the Bernstein-von Mises theorem is simpler, yet it relies - at least in its

Neural Information Processing Systems 

We thank the reviewer for the insightful comments on the proof. We will clarify better in the main text notions like "overparamaterise" or "fully trained". We further evaluate the robustness of deep ensembles on a subset of the NNs employed in Section 5.3. Attacks are performed on 1k test points from the MNIST dataset. For deterministic NNs Theorem 1 does not hold.