Deep learning hyper-parameter optimization is a tough task. Finding an appropriate network configuration is a key to success, however most of the times this labor is roughly done. In this work we introduce a novel library to tackle this problem, the Deep Learning Optimization Library: DLOPT. We briefly describe its architecture and present a set of use examples. This is an open source project developed under the GNU GPL v3 license and it is freely available at https://github.com/acamero/dlopt
This one might be obvious but here it is anyway: when creating a hyper parameters grid, some parameters should take exponential steps. For instance the numbers of nodes hidden layer, you should test 8,32,128 instead of 8,10,12,14, ... because it's safe to assume 10 won't be much different than 8 or 12 (those numbers are problem dependent of course, but you get the idea)
Optical scatterometry is a method to measure the size and shape of periodic micro- or nanostructures on surfaces. For this purpose the geometry parameters of the structures are obtained by reproducing experimental measurement results through numerical simulations. We compare the performance of Bayesian optimization to different local minimization algorithms for this numerical optimization problem. Bayesian optimization uses Gaussian-process regression to find promising parameter values. We examine how pre-computed simulation results can be used to train the Gaussian process and to accelerate the optimization.
Numerical optimization is an important tool in the field of computational physics in general and in nano-optics in specific. It has attracted attention with the increase in complexity of structures that can be realized with nowadays nano-fabrication technologies for which a rational design is no longer feasible. Also, numerical resources are available to enable the computational photonic material design and to identify structures that meet predefined optical properties for specific applications. However, the optimization objective function is in general non-convex and its computation remains resource demanding such that the right choice for the optimization method is crucial to obtain excellent results. Here, we benchmark five global optimization methods for three typical nano-optical optimization problems from the field of shape optimization and parameter reconstruction: downhill simplex optimization, the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm, particle swarm optimization, differential evolution, and Bayesian optimization. In these examples, Bayesian optimization, mainly known from machine learning applications, obtains significantly better results in a fraction of the run times of the other optimization methods.
As deep learning techniques advance more than ever, hyper-parameter optimization is the new major workload in deep learning clusters. Although hyper-parameter optimization is crucial in training deep learning models for high model performance, effectively executing such a computation-heavy workload still remains a challenge. We observe that numerous trials issued from existing hyper-parameter optimization algorithms share common hyper-parameter sequence prefixes, which implies that there are redundant computations from training the same hyper-parameter sequence multiple times. We propose a stage-based execution strategy for efficient execution of hyper-parameter optimization algorithms. Our strategy removes redundancy in the training process by splitting the hyper-parameter sequences of trials into homogeneous stages, and generating a tree of stages by merging the common prefixes. Our preliminary experiment results show that applying stage-based execution to hyper-parameter optimization algorithms outperforms the original trial-based method, saving required GPU-hours and end-to-end training time by up to 6.60 times and 4.13 times, respectively.