

D.2 Countries

Hyperparameters are summarized in Table 6. We ran all experiments on a single CPU (Apple M2).

Table 5: Hyperparameters for the MNIST-addition experiments.

optimizer                  AdamW
learning rate              0.0003
learning rate schedule     cosine
training epochs            100
weight decay               0.00001
batch size                 4
embedding dimensions       10
embedding initialization   one-hot, fixed
neural networks            LeNet5
max search depth           15
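To make the optimization settings in Table 5 concrete, here is a minimal PyTorch sketch of an AdamW plus cosine-schedule training loop using the listed learning rate, weight decay, batch size, and epoch count. The model, the random stand-in data, and the loader are placeholders rather than the paper's actual pipeline; the MNIST-addition-specific components (embeddings, search) are not reproduced.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy MNIST-shaped data so the snippet runs stand-alone; the real experiments use MNIST digits.
images = torch.randn(64, 1, 28, 28)
labels = torch.randint(0, 10, (64,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=4, shuffle=True)

# Simple stand-in for the LeNet5 classifier named in Table 5.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

EPOCHS = 100
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # cosine learning-rate decay over the 100 training epochs
```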


We present conditional monotonicity results using alternative estimators of performance quality.


The Appendix is structured as follows: We provide a proof of conditional guarantees in EENNs for (hard) PoE in Appendix A. We conduct an ablation study for our PA model in Appendix B.2. We report results of NLP experiments in Appendix B.4. We discuss anytime regression and deep ensembles in Appendix B.6. We propose a technique for controlling the violations of conditional monotonicity in PA in Appendix B.8.
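The snippet below is not the paper's PA model; it is a minimal NumPy illustration, under the assumption that each early exit produces class logits, of (i) combining per-exit predictions with a simple product-of-experts rule and (ii) counting violations of conditional monotonicity, i.e. exits at which the probability assigned to the ground-truth class drops. The function names poe_anytime_probs and monotonicity_violations are made up for this sketch.

```python
import numpy as np

def poe_anytime_probs(exit_logits):
    """Anytime prediction at exit m as a renormalized product of the
    softmax outputs of exits 0..m (a generic product-of-experts rule)."""
    shifted = exit_logits - exit_logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)                # per-exit softmax
    log_running = np.cumsum(np.log(probs + 1e-12), axis=0)   # running product in log space
    poe = np.exp(log_running - log_running.max(axis=1, keepdims=True))
    return poe / poe.sum(axis=1, keepdims=True)              # renormalize each exit

def monotonicity_violations(exit_logits, true_label):
    """Number of consecutive-exit pairs where the probability assigned to
    the ground-truth class decreases (violations of conditional monotonicity)."""
    p_true = poe_anytime_probs(exit_logits)[:, true_label]
    return int(np.sum(np.diff(p_true) < 0.0))

# Example: 4 exits, 3 classes, random logits for a single input.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 3))
print(poe_anytime_probs(logits))
print(monotonicity_violations(logits, true_label=2))
```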






We appreciate the valuable comments and positive feedback from the reviewers, and we will revise the paper accordingly to incorporate the comments.

Reviewer #1 (stepsize and preset T): Following the current analysis, for a general stepsize η … Without averaging the iterates, no convergence rate is available. In this paper we consider a neural network with one hidden layer. In particular, Proposition 4.7 shows that neural TD attains the global minimum of the MSBE (without the …). We will revise the "without loss of generality" claim in the revision. We will clarify this notation in the revision.
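For readers unfamiliar with the setting, the following is a generic sketch of semi-gradient TD(0) with a one-hidden-layer ReLU value network in which only the input-layer weights are trained. It is an illustration under those assumptions, not the parameterization, stepsize η, preset horizon T, or iterate-averaging scheme analyzed in the paper, and all names are made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(state_dim, width):
    """One-hidden-layer value network V(s) = a^T ReLU(W s) / sqrt(width)."""
    return {"W": rng.normal(size=(width, state_dim)),
            "a": rng.choice([-1.0, 1.0], size=width)}   # output weights kept fixed

def value(params, s):
    hidden = np.maximum(params["W"] @ s, 0.0)            # ReLU features
    return params["a"] @ hidden / np.sqrt(len(params["a"]))

def td0_step(params, s, r, s_next, gamma=0.99, eta=0.01):
    """One semi-gradient TD(0) update of the hidden-layer weights W."""
    delta = r + gamma * value(params, s_next) - value(params, s)   # TD error
    active = (params["W"] @ s > 0.0).astype(float)                 # ReLU mask at s
    grad_W = (params["a"] * active / np.sqrt(len(params["a"])))[:, None] * s[None, :]
    params["W"] += eta * delta * grad_W                            # W <- W + eta * delta * grad V(s)
    return delta

# Toy usage on random transitions.
params = init_params(state_dim=5, width=64)
for _ in range(10):
    s, s_next = rng.normal(size=5), rng.normal(size=5)
    td0_step(params, s, r=rng.normal(), s_next=s_next)
```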