
Neural Information Processing Systems

Thus, most meta- and transfer-learning HPO methods [7-16] consider a restrictive setting where all tasks must share the same set of hyperparameters so that the input data can be represented as fixed-sized vectors.



Towards Learning Universal Hyperparameter Optimizers with Transformers

Neural Information Processing Systems

Meta-learning hyperparameter optimization (HPO) algorithms from prior experiments is a promising approach to improve optimization efficiency over objective functions from a similar distribution. However, existing methods are restricted to learning from experiments sharing the same set of hyperparameters. In this paper, we introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction when trained on vast tuning data from the wild, such as Google's Vizier database, one of the world's largest HPO datasets. Our extensive experiments demonstrate that the OptFormer can simultaneously imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates. Compared to a Gaussian Process, the OptFormer also learns a robust prior distribution for hyperparameter response functions, and can thereby provide more accurate and better calibrated predictions. This work paves the path to future extensions for training a Transformer-based model as a general HPO optimizer.
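The abstract's "universal end-to-end interface" rests on representing tuning data as text so that studies with different hyperparameter sets can share one model. A minimal sketch of that idea is below; the field names, separators, and `<trial>`/`<study>` markers are illustrative assumptions, not OptFormer's actual token format.

```python
# Hypothetical sketch: serializing hyperparameter tuning trials as flat
# text, in the spirit of OptFormer's text-based interface. Because the
# representation is a string rather than a fixed-size vector, studies
# with different hyperparameter sets fit the same interface.

def serialize_trial(params: dict, objective: float) -> str:
    """Render one tuning trial as a flat text segment."""
    kv = ",".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"<trial> {kv} | y={objective:.4f} </trial>"

def serialize_study(metadata: dict, trials: list) -> str:
    """Concatenate study metadata and trial history into one string
    that an autoregressive Transformer could consume."""
    header = ";".join(f"{k}:{v}" for k, v in sorted(metadata.items()))
    body = " ".join(serialize_trial(p, y) for p, y in trials)
    return f"<study> {header} </study> {body}"

history = [({"lr": 0.01, "layers": 2}, 0.81),
           ({"lr": 0.10, "layers": 3}, 0.77)]
text = serialize_study({"task": "image-classification"}, history)
```

With such a serialization, both "suggest the next hyperparameters" and "predict the resulting objective" reduce to generating the next tokens of the sequence, which is how the paper frames joint policy and function learning.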



BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

Hung, Yu-Heng, Lin, Kai-Jie, Lin, Yu-Heng, Wang, Chien-Yi, Sun, Cheng, Hsieh, Ping-Chun

arXiv.org Artificial Intelligence

Bayesian optimization (BO) offers an efficient pipeline for optimizing black-box functions with the help of a Gaussian process prior and an acquisition function (AF). Recently, in the context of single-objective BO, learning-based AFs have shown promising empirical results, given their favorable non-myopic nature. Despite this, the direct extension of these approaches to multi-objective Bayesian optimization (MOBO) suffers from the hypervolume identifiability issue, which results from the non-Markovian nature of MOBO problems. To tackle this, inspired by the non-Markovian RL literature and the success of Transformers in language modeling, we present a generalized deep Q-learning framework and propose BOFormer, which substantiates this framework for MOBO via sequence modeling. Through extensive evaluation, we demonstrate that BOFormer consistently outperforms benchmark rule-based and learning-based algorithms on various synthetic MOBO and real-world multi-objective hyperparameter optimization problems. We have made the source code publicly available to encourage further research in this direction.
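The "hypervolume identifiability issue" the abstract mentions concerns the hypervolume indicator, the standard scalar measure of Pareto-front quality that MOBO methods try to increase. For concreteness, here is a minimal two-objective version of that indicator; this is a generic textbook computation, not BOFormer's implementation, and it assumes both objectives are minimized with a reference point dominated by all candidates.

```python
# Minimal 2-D hypervolume indicator: the area dominated by the Pareto
# front of `points`, bounded by reference point `ref` (both objectives
# minimized). MOBO acquisition rules typically score a candidate by how
# much it would increase this quantity.

def hypervolume_2d(points, ref):
    """Area dominated by the non-dominated subset of `points` w.r.t. `ref`."""
    # Sort by first objective; sweep to keep only non-dominated points.
    pts = sorted(points)
    front, best_y = [], float("inf")
    for x, y in pts:
        if y < best_y:          # strictly improves the second objective
            front.append((x, y))
            best_y = y
    # Sum horizontal slabs between consecutive front points.
    hv, prev_y = 0.0, ref[1]
    for x, y in front:
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv
```

Note that the hypervolume of the current front depends on the whole history of evaluated points, not just the latest one, which is the non-Markovian structure the paper's Q-learning framework is built to handle.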



Multi-step Planning for Automated Hyperparameter Optimization with OptFormer

Dery, Lucio M., Friesen, Abram L., De Freitas, Nando, Ranzato, Marc'Aurelio, Chen, Yutian

arXiv.org Artificial Intelligence

As machine learning permeates more industries and models become more expensive and time-consuming to train, the need for efficient automated hyperparameter optimization (HPO) has never been more pressing. Multi-step planning-based approaches to hyperparameter optimization promise improved efficiency over myopic alternatives by more effectively balancing exploration and exploitation. However, the potential of these approaches has not been fully realized due to their technical complexity and computational intensity. In this work, we leverage recent advances in Transformer-based, natural-language-interfaced hyperparameter optimization to circumvent these barriers. We build on top of the recently proposed OptFormer, which casts both hyperparameter suggestion and target function approximation as autoregressive generation, thus making planning via rollouts simple and efficient. We conduct an extensive exploration of different strategies for performing multi-step planning on top of the OptFormer model to highlight its potential for use in constructing non-myopic HPO strategies.


Multi-step Planning for Automated Hyperparameter Optimization with OptFormer

#artificialintelligence

Unlike myopic HPO methods, planning-based approaches fundamentally require building models of the future to assess the impact of a current decision on later timesteps. Though these methods also rely on a GP as a surrogate model, each point in multi-step planning involves fantasizing/imagining an updated GP posterior p(f̃_{t+1} | x̃_t), …, p(f̃_{t+h} | x̃_t, x̃_{t+1}, …, x̃_{t+h−1}) based on simulated choices from lookaheads {(x̃_t, ỹ_t), …, (x̃_{t+h−1}, ỹ_{t+h−1})} (Lam et al., 2016; Jiang et al., 2020). Note that we use x̃_t to represent a fantasized decision, while x_t is the actual choice made at timestep t. Whilst multi-step planning is promising, constructing the posterior of a GP model requires matrix inversion, which is a compute-intensive operation (Cormen et al., 2022). Even outside of this limitation, traditional planning-based approaches are compute intensive due to (i) the poor scaling behavior of the search tree, O(q^h), where q is the number of choices at each decision point for each lookahead step (Lam et al., 2016; Lam and Willcox, 2017), which forces most methods to explore short horizons, typically h ∈ {1, 2}; and (ii) nested expectation and maximization: marginalizing over future observations ỹ_{t+j}, j ≤ h, and performing a global search on the acquisition function to obtain the query x̃_{t+j} at every lookahead step.
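The O(q^h) search tree described above can be made concrete with a small rollout sketch. This is a generic illustration under stated simplifications: `surrogate` stands in for any model that can fantasize an outcome ỹ for a candidate x̃ given the history (a GP posterior mean or an OptFormer-style sequence model), its call signature is an assumption, and the nested expectation over ỹ is collapsed to a single point prediction for brevity.

```python
# Exhaustive h-step lookahead over a discrete candidate set, showing
# why the tree has q^h leaves: each of the h levels branches over all
# q candidates, and each branch conditions the surrogate on the
# fantasized history accumulated so far.

def rollout_value(surrogate, history, candidates, h):
    """Best cumulative fantasized objective reachable within h steps."""
    if h == 0:
        return 0.0
    best = float("-inf")
    for x in candidates:                      # q branches per level
        y = surrogate(history, x)             # fantasized observation y~
        future = rollout_value(surrogate, history + [(x, y)], candidates, h - 1)
        best = max(best, y + future)
    return best

def plan_first_step(surrogate, history, candidates, h):
    """Pick the actual next query x_t by maximizing the h-step rollout."""
    def score(x):
        y = surrogate(history, x)
        return y + rollout_value(surrogate, history + [(x, y)], candidates, h - 1)
    return max(candidates, key=score)

# Toy deterministic surrogate (illustrative only): predicted objective
# shrinks slightly as the fantasized history grows.
toy = lambda hist, x: x - 0.1 * len(hist)
```

Replacing the GP in this loop with an autoregressive model that emits ỹ directly is exactly the simplification the OptFormer-based planner exploits: fantasizing becomes cheap token generation instead of repeated posterior updates with matrix inversion.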