Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Szymon Sieradzki, Jacek Mańdziuk

arXiv.org Artificial Intelligence 

In this paper we review hyperparameter optimization methods for machine learning models, with a particular focus on the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm. We propose several modifications to ATPE and assess their efficacy on a diverse set of standard benchmark functions. Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization on selected benchmarks, a finding of practical relevance for their application to real-world machine learning and optimization tasks.

In machine learning, the performance of a model depends heavily on the correct choice of hyperparameters, such as the learning rate, the number of layers in a neural network, or the choice of regularization technique. These hyperparameters form a multidimensional space in which some dimensions are continuous (e.g., the learning rate) while others are discrete (e.g., the number of network layers). Hyperparameter Optimization (HPO) aims to find the best combination of hyperparameters by searching this space so as to optimize a predefined objective function. In supervised learning, this function is usually a loss function, which quantifies the error between the model's predictions and the true values.

HPO is applicable across a wide range of machine learning models, as most optimization techniques are agnostic to the underlying model type; the core requirement for any HPO algorithm is a definition of the hyperparameter space and the objective function. However, HPO presents specific challenges that distinguish it from other optimization problems. Each evaluation of the objective function requires training the considered machine learning model from scratch, which is often the most time-consuming part of the optimization process. As a result, HPO algorithms are designed to focus less on the internal computational efficiency of the optimizer and more on minimizing the number of objective function evaluations while still achieving good predictive performance.
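The HPO setting described above can be made concrete with a minimal sketch: a mixed continuous/discrete hyperparameter space, an objective function standing in for model training, and an optimizer spending a fixed evaluation budget. The sketch below uses plain random search as a baseline (not the paper's ATPE method), and the space, synthetic objective, and all names are illustrative assumptions, not taken from the paper.

```python
import math
import random

# Hypothetical mixed hyperparameter space (illustrative, not from the paper):
# a continuous learning rate sampled log-uniformly and a discrete layer count.
def sample_config(rng):
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # continuous dimension
        "num_layers": rng.choice([1, 2, 3, 4]),      # discrete dimension
    }

# Stand-in objective: in real HPO this would train the model from scratch and
# return its validation loss, which dominates the total optimization runtime.
def objective(config):
    lr, layers = config["learning_rate"], config["num_layers"]
    return (math.log10(lr) + 2.0) ** 2 + 0.1 * (layers - 3) ** 2

def random_search(budget=50, seed=0):
    """Baseline HPO loop: spend a fixed evaluation budget, keep the best."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(budget):  # each iteration = one costly objective evaluation
        config = sample_config(rng)
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

best_config, best_loss = random_search()
print(best_config, best_loss)
```

Model-based optimizers such as TPE and ATPE keep the same interface (space definition plus objective) but choose each new configuration using the results of previous evaluations, which is how they reduce the number of evaluations needed.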
