AITopics | hyperparameter learning

hyperparameter learning

Hyperparameter Learning via Distributional Transfer

Neural Information Processing Systems, Dec-25-2025, 12:05:54 GMT

distributional transfer, hyperparameter learning, name change, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

Reviews: Hyperparameter Learning via Distributional Transfer

Neural Information Processing Systems, Jan-24-2025, 08:20:37 GMT

This paper proposes a novel method for transfer learning in Bayesian hyperparameter optimization, based on the premise that the distributions of previously observed datasets contain significant information that should not be ignored when optimizing hyperparameters on a new dataset. The authors propose ways to compare datasets through distribution estimation and then combine this information with the classical Bayesian hyperparameter optimization setup. Experiments show that the method outperforms the selected baselines. Originality: the method is novel, although it mostly bridges ideas from various fields. Quality: I would like to congratulate the authors on a very well written paper.

distributional transfer, hyperparameter learning, hyperparameter optimization, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Hyperparameter Learning via Distributional Transfer

Neural Information Processing Systems, Oct-10-2024, 04:41:27 GMT

Bayesian optimisation is a popular technique for hyperparameter learning but typically requires initial exploration even in cases where similar prior tasks have been solved. We propose to transfer information across tasks using learnt representations of the training datasets used in those tasks. The representations make use of the framework of distribution embeddings into reproducing kernel Hilbert spaces. The developed method converges faster than existing baselines, in some cases requiring only a few evaluations of the target objective.
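To make the idea of comparing tasks through distribution embeddings more concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a Gaussian (RBF) kernel with a hand-picked bandwidth and scores dataset similarity by the squared maximum mean discrepancy (MMD) between empirical kernel mean embeddings. The function names and the toy datasets are made up for illustration.

```python
import math

def rbf(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth=1.0):
    """Inner product of the empirical mean embeddings of xs and ys."""
    total = sum(rbf(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd2(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD, ||mu_X - mu_Y||^2 in the RKHS."""
    return (mean_kernel(xs, xs, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth)
            + mean_kernel(ys, ys, bandwidth))

# Toy datasets (assumed for illustration): two similar tasks, one shifted task.
task_a = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
task_b = [(0.1, 0.0), (0.0, -0.2), (-0.2, 0.1)]
task_c = [(3.0, 3.1), (2.9, 3.0), (3.2, 2.8)]

# A smaller MMD suggests the prior task is a closer match to the new one.
print("a vs b:", mmd2(task_a, task_b))
print("a vs c:", mmd2(task_a, task_c))
```

In a transfer setting, a similarity of this kind could be used to weight evaluations from prior tasks when modelling the objective on a new task; how the paper does this in detail is not specified in the abstract above.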

distributional transfer, hyperparameter learning, representation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.51)

Accelerating Multi-Block Constrained Optimization Through Learning to Optimize

Liang, Ling; Austin, Cameron; Yang, Haizhao

arXiv.org Artificial Intelligence, Sep-25-2024

Learning to Optimize (L2O) approaches, including algorithm unrolling, plug-and-play methods, and hyperparameter learning, have garnered significant attention and have been successfully applied to the Alternating Direction Method of Multipliers (ADMM) and its variants. However, the natural extension of L2O to multi-block ADMM-type methods remains largely unexplored. Such an extension is critical, as multi-block methods leverage the separable structure of optimization problems, offering substantial reductions in per-iteration complexity. Given that classical multi-block ADMM does not guarantee convergence, the Majorized Proximal Augmented Lagrangian Method (MPALM), which shares a similar form with multi-block ADMM and ensures convergence, is more suitable in this setting. Despite its theoretical advantages, MPALM's performance is highly sensitive to the choice of penalty parameters. To address this limitation, we propose a novel L2O approach that adaptively selects this hyperparameter using supervised learning. We demonstrate the versatility and effectiveness of our method by applying it to the Lasso problem and the optimal transport problem. Our numerical results show that the proposed framework outperforms popular alternatives. Given its applicability to generic linearly constrained composite optimization problems, this work opens the door to a wide range of potential real-world applications.
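The sensitivity to the penalty parameter can be seen even in the simplest setting. The sketch below is classical two-block ADMM for the Lasso, not MPALM and not the learned selection rule from the paper; it only illustrates where a penalty parameter (called `rho` here, an assumed name) enters the iteration. The problem sizes, the random data, and the candidate values of `rho` are arbitrary choices for illustration.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho, iters=200):
    """Two-block ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1 s.t. x = z."""
    n = A.shape[1]
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    lhs = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(lhs, Atb + rho * (z - u))  # quadratic subproblem
        z = soft_threshold(x + u, lam / rho)           # l1 subproblem
        u = u + x - z                                  # dual update
    return z

def lasso_objective(A, b, x, lam):
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

# Synthetic problem (assumed data): compare progress after a few iterations.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 5))
b = rng.normal(size=30)
for rho in (0.1, 1.0, 10.0, 100.0):
    z = admm_lasso(A, b, lam=1.0, rho=rho, iters=10)
    print(f"rho={rho:>6}: objective after 10 iterations = "
          f"{lasso_objective(A, b, z, 1.0):.6f}")
```

With a fixed iteration budget, different values of `rho` typically reach different objective values, which is the kind of dependence a learned selection rule would aim to exploit.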

algorithm, application, learning, (13 more...)

arXiv.org Artificial Intelligence

2409.1732

Country:

North America > United States > Maryland (0.04)
Europe > Italy > Lazio > Rome (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Power Industry (0.46)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models

Li, Rui; John, ST; Solin, Arno

arXiv.org Artificial Intelligence, Jun-7-2023

Approximate inference in Gaussian process (GP) models with non-conjugate likelihoods gets entangled with the learning of the model hyperparameters. We improve hyperparameter learning in GP models and focus on the interplay between variational inference (VI) and the learning target. While VI's lower bound to the marginal likelihood is a suitable objective for inferring the approximate posterior, we show that a direct approximation of the marginal likelihood as in Expectation Propagation (EP) is a better learning objective for hyperparameter optimization. We design a hybrid training procedure to bring the best of both worlds: it leverages conjugate-computation VI for inference and uses an EP-like marginal likelihood approximation for hyperparameter learning. We compare VI, EP, Laplace approximation, and our proposed training procedure and empirically demonstrate the effectiveness of our proposal across a wide range of data sets.
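For orientation, the relationship between VI's lower bound and the marginal likelihood can be written as the standard decomposition below. The notation is assumed rather than taken from the paper: $\mathbf{f}$ denotes the latent function values, $\mathbf{y}$ the observations, $\theta$ the hyperparameters, and $q$ the approximate posterior.

```latex
\log p(\mathbf{y} \mid \theta)
  = \underbrace{\mathbb{E}_{q(\mathbf{f})}\!\left[\log p(\mathbf{y} \mid \mathbf{f})\right]
      - \mathrm{KL}\!\left[q(\mathbf{f}) \,\|\, p(\mathbf{f} \mid \theta)\right]}_{\mathcal{L}(q,\,\theta)}
  + \mathrm{KL}\!\left[q(\mathbf{f}) \,\|\, p(\mathbf{f} \mid \mathbf{y}, \theta)\right]
  \;\ge\; \mathcal{L}(q, \theta)
```

The gap between the bound $\mathcal{L}(q, \theta)$ and the log marginal likelihood is itself a function of $\theta$, so maximizing the bound over $\theta$ is not the same as maximizing the marginal likelihood. This is one way to read the motivation for using a direct marginal likelihood approximation as the learning objective.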

artificial intelligence, likelihood, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2306.04201

Country:

Europe > Finland (0.04)
North America > United States > New York (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.32)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms

Neural Information Processing Systems, Apr-6-2023, 15:13:18 GMT

Semi-supervised learning algorithms have been successfully applied in many applications with scarce labeled data, by utilizing the unlabeled data. One important category is graph-based semi-supervised learning algorithms, whose performance depends considerably on the quality of the graph, or its hyperparameters. In this paper, we deal with the less explored problem of learning the graphs. We propose a graph learning method for the harmonic energy minimization method; this is done by minimizing the leave-one-out prediction error on labeled data points. We use a gradient-based method and design an efficient algorithm that significantly accelerates the calculation of the gradient by applying the matrix inversion lemma and using careful pre-computation.
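The following sketch illustrates the two ingredients named in the abstract: the harmonic solution on a graph and a leave-one-out error on the labeled points. It is a naive version under stated assumptions, not the paper's algorithm: the graph is built from an RBF kernel with a single `length_scale` hyperparameter (an assumed parameterization), and the leave-one-out error is recomputed from scratch for each held-out point instead of using the matrix inversion lemma or gradients.

```python
import numpy as np

def rbf_graph(X, length_scale):
    """Dense weight matrix W with W_ij = exp(-||x_i - x_j||^2 / length_scale^2)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / length_scale ** 2)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def harmonic_predict(W, labeled_idx, y_labeled):
    """Harmonic solution: f_u = (D_uu - W_uu)^{-1} W_ul f_l."""
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    laplacian = np.diag(W.sum(axis=1)) - W
    L_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_u = np.linalg.solve(L_uu, W_ul @ y_labeled)
    return unlabeled_idx, f_u

def loo_error(X, labeled_idx, y_labeled, length_scale):
    """Mean squared leave-one-out error over the labeled points (naive)."""
    W = rbf_graph(X, length_scale)
    err = 0.0
    for k, i in enumerate(labeled_idx):
        keep = [j for j in range(len(labeled_idx)) if j != k]
        idx = [labeled_idx[j] for j in keep]
        u_idx, f_u = harmonic_predict(W, idx, y_labeled[keep])
        pred = f_u[np.where(u_idx == i)[0][0]]  # prediction at held-out node
        err += (pred - y_labeled[k]) ** 2
    return err / len(labeled_idx)

# Toy data (assumed): two well-separated 1-D clusters, four labeled points.
X = np.array([[0.0], [0.1], [0.2], [2.0], [2.1], [2.2]])
labeled = [0, 2, 3, 5]
y = np.array([0.0, 0.0, 1.0, 1.0])

for ls in (0.5, 10.0):
    print(f"length_scale={ls}: LOO error = {loo_error(X, labeled, y, ls):.4f}")
```

A simple search over `length_scale` that minimizes this error is the brute-force counterpart of the gradient-based procedure described above.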

graph, hyperparameter learning, semi-supervised learning algorithm, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)

Congratulations to the #ICML2022 outstanding paper award winners

AIHub, Jul-21-2022, 13:52:05 GMT

The International Conference on Machine Learning (ICML) Outstanding Paper awards are given to papers from the current conference that are "strong representatives of solid theoretical and empirical work in the field". This year, there were 15 awards.

Monarch: Expressive structured matrices for efficient and accurate training
Tri Dao, Beidi Chen, Nimit Sohoni, Arjun Desai, Michael Poli, Jessica Grogan, Alexander Liu, Aniruddh Rao, Atri Rudra, Christopher Re
Abstract: Large neural networks excel in many domains, but they are expensive to train and fine-tune. A popular approach to reduce their compute or memory requirements is to replace dense weight matrices with structured ones (e.g., sparse, low-rank, Fourier transform). These methods have not seen widespread adoption (1) in end-to-end training due to unfavorable efficiency–quality tradeoffs, and (2) in dense-to-sparse fine-tuning due to lack of tractable algorithms to approximate a given dense weight matrix.

algorithm, marginal likelihood, matrix, (13 more...)

AIHub

Genre: Personal > Honors > Award (0.40)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Hyperparameter Learning via Distributional Transfer

Law, Ho Chung, Zhao, Peilin, Chan, Leung Sing, Huang, Junzhou, Sejdinovic, Dino

Neural Information Processing Systems, Mar-18-2020, 23:17:17 GMT

distributional transfer, hyperparameter learning, representation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Hyperparameter Learning for Conditional Kernel Mean Embeddings with Rademacher Complexity Bounds

Hsu, Kelvin; Nock, Richard; Ramos, Fabio

arXiv.org Machine Learning, Nov-7-2018

Conditional kernel mean embeddings are nonparametric models that encode conditional expectations in a reproducing kernel Hilbert space. While they provide a flexible and powerful framework for probabilistic inference, their performance is highly dependent on the choice of kernel and regularization hyperparameters. Nevertheless, current hyperparameter tuning methods predominantly rely on expensive cross-validation or heuristics that are not optimized for the inference task. For conditional kernel mean embeddings with categorical targets and arbitrary inputs, we propose a hyperparameter learning framework based on Rademacher complexity bounds to prevent overfitting by balancing data fit against model complexity. Our approach only requires batch updates, allowing scalable kernel hyperparameter tuning without invoking kernel approximations. Experiments demonstrate that our learning framework outperforms competing methods, and can be further extended to incorporate and learn deep neural network weights to improve generalization.
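To show where the kernel and regularization hyperparameters enter, here is a minimal sketch of an empirical conditional mean embedding used as a classifier with categorical targets. It assumes an RBF kernel and one-hot encoded labels, and it does not implement the Rademacher-complexity-based learning objective from the paper. The names `bandwidth` and `reg` and the toy data are assumptions for illustration.

```python
import numpy as np

def rbf_gram(X1, X2, bandwidth):
    """Gram matrix of the Gaussian kernel between rows of X1 and rows of X2."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def cme_decision(X_train, y_train, X_test, n_classes, bandwidth, reg):
    """Class scores Y^T (K + n*reg*I)^{-1} k(x) for each test point x."""
    n = X_train.shape[0]
    Y = np.eye(n_classes)[y_train]               # one-hot labels, shape (n, C)
    K = rbf_gram(X_train, X_train, bandwidth)
    weights = np.linalg.solve(K + n * reg * np.eye(n), Y)
    return rbf_gram(X_test, X_train, bandwidth) @ weights  # shape (m, C)

# Toy data (assumed): two 1-D clusters with one class each.
X_train = np.array([[0.0], [0.2], [2.0], [2.2]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.1], [2.1]])

scores = cme_decision(X_train, y_train, X_test, n_classes=2,
                      bandwidth=0.5, reg=1e-3)
print(scores.argmax(axis=1))
```

The scores are estimates of class probabilities but are not guaranteed to be non-negative or to sum to one, and both `bandwidth` and `reg` change them, which is why a principled way of choosing these values matters.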

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1809.00175

Country:

Oceania > Australia (0.46)
Asia (0.28)
North America > United States (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Hyperparameter Learning via Distributional Transfer

Law, Ho Chung Leon, Zhao, Peilin, Huang, Junzhou, Sejdinovic, Dino

arXiv.org Machine Learning, Oct-15-2018

Bayesian optimisation is a popular technique for hyperparameter learning but typically requires initial 'exploration' even in cases where potentially similar prior tasks have been solved. We propose to transfer information across tasks using kernel embeddings of the distributions of the training datasets used in those tasks. The resulting method converges faster than existing baselines, in some cases requiring only a few evaluations of the target objective.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Machine Learning

1810.06305

Genre: Research Report (0.40)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
