Deep learning hyper-parameter optimization is a tough task. Finding an appropriate network configuration is a key to success, however most of the times this labor is roughly done. In this work we introduce a novel library to tackle this problem, the Deep Learning Optimization Library: DLOPT. We briefly describe its architecture and present a set of use examples. This is an open source project developed under the GNU GPL v3 license and it is freely available at https://github.com/acamero/dlopt
Regret matching is a widely-used algorithm for learning how to act. We begin by proving that regrets on actions in one setting (game) can be transferred to warm start the regrets for solving a different setting with same structure but different payoffs that can be written as a function of parameters. We prove how this can be done by carefully discounting the prior regrets. This provides, to our knowledge, the first principled warm-starting method for no-regret learning. It also extends to warm-starting the widely-adopted counterfactual regret minimization (CFR) algorithm for large incomplete-information games; we show this experimentally as well. We then study optimizing a parameter vector for a player in a two-player zero-sum game (e.g., optimizing bet sizes to use in poker). We propose a custom gradient descent algorithm that provably finds a locally optimal parameter vector while leveraging our warm-start theory to significantly save regret-matching iterations at each step. It optimizes the parameter vector while simultaneously finding an equilibrium. We present experiments in no-limit Leduc Hold’em and no-limit Texas Hold’em to optimize bet sizing. This amounts to the first action abstraction algorithm (algorithm for selecting a small number of discrete actions to use from a continuum of actions—a key preprocessing step for solving large games using current equilibrium-finding algorithms) with convergence guarantees for extensive-form games.
This one might be obvious but here it is anyway: when creating a hyper parameters grid, some parameters should take exponential steps. For instance the numbers of nodes hidden layer, you should test 8,32,128 instead of 8,10,12,14, ... because it's safe to assume 10 won't be much different than 8 or 12 (those numbers are problem dependent of course, but you get the idea)
Optical scatterometry is a method to measure the size and shape of periodic micro- or nanostructures on surfaces. For this purpose the geometry parameters of the structures are obtained by reproducing experimental measurement results through numerical simulations. We compare the performance of Bayesian optimization to different local minimization algorithms for this numerical optimization problem. Bayesian optimization uses Gaussian-process regression to find promising parameter values. We examine how pre-computed simulation results can be used to train the Gaussian process and to accelerate the optimization.
Numerical optimization is an important tool in the field of computational physics in general and in nano-optics in specific. It has attracted attention with the increase in complexity of structures that can be realized with nowadays nano-fabrication technologies for which a rational design is no longer feasible. Also, numerical resources are available to enable the computational photonic material design and to identify structures that meet predefined optical properties for specific applications. However, the optimization objective function is in general non-convex and its computation remains resource demanding such that the right choice for the optimization method is crucial to obtain excellent results. Here, we benchmark five global optimization methods for three typical nano-optical optimization problems from the field of shape optimization and parameter reconstruction: downhill simplex optimization, the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm, particle swarm optimization, differential evolution, and Bayesian optimization. In these examples, Bayesian optimization, mainly known from machine learning applications, obtains significantly better results in a fraction of the run times of the other optimization methods.