Descending through a Crowded Valley -- Benchmarking Deep Learning Optimizers

Schmidt, Robin M., Schneider, Frank, Hennig, Philipp

arXiv.org Machine Learning 

Choosing the optimizer is considered to be among the most crucial design decisions in deep learning, and it is not an easy one. The growing literature now lists hundreds of optimization methods. In the absence of clear theoretical guidance and conclusive empirical evidence, the decision is often made based on anecdotes. In this work, we aim to replace these anecdotes, if not with a conclusive ranking, then at least with evidence-backed heuristics. To do so, we perform an extensive, standardized benchmark of more than a dozen particularly popular deep learning optimizers while giving a concise overview of the wide range of possible choices. Analyzing almost 35,000 individual runs, we contribute the following three points: (i) Optimizer performance varies greatly across tasks. (ii) We observe that evaluating multiple optimizers with default hyperparameters works approximately as well as tuning the hyperparameters of a single, fixed optimizer. (iii) While we cannot discern an optimization method clearly dominating across all tested tasks, we identify a significantly reduced subset of specific algorithms and parameter choices that generally lead to competitive results in our experiments. This subset includes popular favorites and some lesser-known contenders. We have open-sourced all our experimental results, making them directly available as challenging and well-tuned baselines. This allows for more meaningful comparisons when evaluating novel optimization methods without requiring any further computational effort.

Large-scale stochastic optimization drives a wide variety of machine learning tasks. Because choosing the right optimization algorithm and effectively tuning its hyperparameters heavily influences the training speed and final performance of the learned model, doing so is an important, everyday challenge to practitioners. Hence, stochastic optimization methods have been a focal point of research (cf. Figure 1), engendering an ever-growing list of algorithms, many of them specifically targeted towards deep learning.
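As a concrete illustration of the heuristic in point (ii), the following is a minimal sketch, not the paper's actual benchmark suite, of how one might compare a handful of popular optimizers, each run with its library-default hyperparameters, on a toy task in PyTorch. The synthetic regression problem, seed, and step count are arbitrary choices made purely for illustration.

# Illustrative sketch only -- not the benchmark used in the paper.
# Compares several popular optimizers, each with library-default
# hyperparameters, on a small synthetic regression task.
import torch
from torch import nn

torch.manual_seed(0)

# Toy data: noisy linear regression (a stand-in for a real benchmark problem).
X = torch.randn(512, 20)
true_w = torch.randn(20, 1)
y = X @ true_w + 0.1 * torch.randn(512, 1)

optimizers = {
    "SGD": lambda p: torch.optim.SGD(p, lr=0.01),
    "SGD+momentum": lambda p: torch.optim.SGD(p, lr=0.01, momentum=0.9),
    "Adam": lambda p: torch.optim.Adam(p),        # library defaults
    "RMSprop": lambda p: torch.optim.RMSprop(p),  # library defaults
}

loss_fn = nn.MSELoss()
for name, make_opt in optimizers.items():
    model = nn.Linear(20, 1)   # fresh model per optimizer for a fair start
    opt = make_opt(model.parameters())
    for step in range(200):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    print(f"{name:>14s}: final training loss = {loss.item():.4f}")

In the spirit of the paper's findings, the point of such a comparison is not to crown a winner on one toy problem, but to check several off-the-shelf optimizers cheaply before committing to expensive hyperparameter tuning of any single one.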
