Bayesian Learning
Efficient Bayesian Learning Curve Extrapolation using Prior-Data Fitted Networks
Learning curve extrapolation aims to predict model performance in later epochs of training, based on the performance in earlier epochs. In this work, we argue that, while the inherent uncertainty in the extrapolation of learning curves warrants a Bayesian approach, existing methods are (i) overly restrictive, and/or (ii) computationally expensive.
A Proofs from Section 2
Algorithm 4: Output α̂ null G1 (1 η
Return α̂
We show the following generalization of Proposition 2.1. Moreover, Alg. 4 has sample complexity The sample complexity is clear so we focus on the first statement. Theorem 4.5 in [MU17]) on these events as i varies and noting that Hence recalling (A.2) above, we conclude that The other direction is similar. Using (A.2) in the same way as above, we find First we analyze the expected sample complexity. Finally Alg. 4 has sample complexity We do this using Bayes' rule.