AITopics | hyperparameter selection

Statistically Valid Hyperparameter Selection: From Tuning to Guarantees

Farzaneh, Amirmohammad, Simeone, Osvaldo

arXiv.org Machine Learning | Jun-25-2026

Hyperparameter selection is a critical step in the deployment of modern artificial intelligence systems, given the need to tune degrees of freedom such as inference-time parameters, implementation-level settings, and thresholds driving decision rules. Despite its practical importance, hyperparameter selection is typically performed using best-effort empirical methods such as grid search or Bayesian optimization, which provide no formal statistical guarantees on reliability or safety. This monograph presents a unified statistical framework for reliable hyperparameter selection, centered on the learn-then-test (LTT) paradigm, which formulates the problem as multiple hypothesis testing over a candidate set of hyperparameters. The framework enables the selection of hyperparameters that provably satisfy application-specific reliability requirements -- such as bounds on average risk, quantile risk, or information-theoretic constraints -- with explicit, finite-sample control of error probabilities. The supporting statistical machinery, namely p-values, e-values, and concentration inequalities, is developed from first principles in a dedicated appendix.
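The multiple-hypothesis-testing view described in this abstract can be illustrated with a minimal sketch. It assumes losses bounded in [0, 1], a Hoeffding-based p-value, and a Bonferroni correction; the function names, candidate labels, and calibration data are hypothetical, and the monograph itself covers more refined choices.

```python
import math

def hoeffding_p_value(losses, alpha):
    """Valid p-value for the null 'expected loss > alpha', losses in [0, 1]."""
    n = len(losses)
    empirical_risk = sum(losses) / n
    gap = max(alpha - empirical_risk, 0.0)
    return math.exp(-2.0 * n * gap ** 2)

def ltt_bonferroni(losses_by_candidate, alpha, delta):
    """Keep candidates whose null is rejected at level delta / (number of candidates)."""
    m = len(losses_by_candidate)
    return [
        name
        for name, losses in losses_by_candidate.items()
        if hoeffding_p_value(losses, alpha) <= delta / m
    ]

# Hypothetical calibration losses (1 = failure) for two candidate thresholds.
calibration_losses = {
    "threshold=0.3": [1] * 10 + [0] * 190,   # empirical risk 0.05
    "threshold=0.7": [1] * 60 + [0] * 140,   # empirical risk 0.30
}
selected = ltt_bonferroni(calibration_losses, alpha=0.2, delta=0.1)
print(selected)
```

Under these assumptions, and with i.i.d. calibration data, every returned candidate has expected loss at most `alpha` with probability at least `1 - delta`.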

large language model, machine learning, natural language, (18 more...)

2606.25601

Genre: Research Report (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs

Neural Information Processing Systems | Jun-16-2026, 09:25:50 GMT

The selection of hyperparameters, such as prompt templates in large language models (LLMs), must often strike a balance between reliability and cost. In many cases, structural relationships between the expected reliability levels of the hyperparameters can be inferred from prior information and held-out data - e.g., longer prompt templates may be more detailed and thus more reliable. However, existing hyperparameter selection methods either do not provide formal reliability guarantees or are unable to incorporate structured knowledge in the hyperparameter space. This paper introduces reliability graph-based Pareto testing (RG-PT), a novel multi-objective hyperparameter selection framework that maintains formal reliability guarantees in terms of false discovery rate (FDR), while accounting for known relationships among hyperparameters via a directed acyclic graph. Edges in the graph reflect expected reliability and cost trade-offs among hyperparameters, which are inferred via the Bradley-Terry (BT) ranking model from prior information and held-out data. Experimental evaluations demonstrate that RG-PT significantly outperforms existing methods such as learn-then-test (LTT) and Pareto testing (PT) through a more efficient exploration of the hyperparameter space.
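To make the false discovery rate (FDR) guarantee mentioned in this abstract concrete, here is a minimal sketch of the classical Benjamini-Hochberg procedure. This is a stand-in for illustration only: it is not RG-PT and uses no reliability graph, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q):
    """Return indices of hypotheses rejected by Benjamini-Hochberg at FDR level q."""
    m = len(p_values)
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank whose p-value clears its threshold.
        if p_values[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])

# Hypothetical p-values, one per candidate hyperparameter (e.g., prompt template).
p_values = [0.3, 0.001, 0.04, 0.8, 0.02]
reliable = benjamini_hochberg(p_values, q=0.1)
print(reliable)
```

Here a rejected hypothesis means the corresponding candidate is declared reliable. The FDR bound for this procedure holds under independence (or positive dependence) of the p-values, which is an assumption of the sketch.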

large language model, machine learning, natural language, (20 more...)

Country: Europe > United Kingdom (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

c23ccf9eedf87e4380e92b75b24955bb-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 18:26:05 GMT

hyperparameter, learning, training data, (15 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Parameter-free HE-friendly Logistic Regression

Neural Information Processing Systems | Dec-24-2025, 01:42:40 GMT

Privacy in machine learning has been widely recognized as an essential ethical and legal issue, because the data used for machine learning may contain sensitive information. Homomorphic encryption has recently attracted attention as a key solution to preserve privacy in machine learning applications. However, current approaches on the training of encrypted machine learning have relied heavily on hyperparameter selection, which should be avoided owing to the extreme difficulty of conducting validation on encrypted data. In this study, we propose an effective privacy-preserving logistic regression method that is free from the approximation of the sigmoid function and hyperparameter selection. In our framework, a logistic regression model can be transformed into the corresponding ridge regression for the logit function. We provide a theoretical background for our framework by suggesting a new generalization error bound on the encrypted data. Experiments on various real-world data show that our framework achieves better classification results while reducing latency by $\sim68\%$, compared to the previous models.

hyperparameter selection, name change, parameter-free he-friendly logistic regression, (3 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology > Security & Privacy (0.84)
Law (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Adaptive Federated Learning via Dynamical System Model

Agarwal, Aayushya, Pileggi, Larry, Joshi, Gauri

arXiv.org Artificial Intelligence | Oct-7-2025

Hyperparameter selection is critical for stable and efficient convergence of heterogeneous federated learning, where clients differ in computational capabilities, and data distributions are non-IID. Tuning hyperparameters is a manual and computationally expensive process as the hyperparameter space grows combinatorially with the number of clients. To address this, we introduce an end-to-end adaptive federated learning method in which both clients and central agents adaptively select their local learning rates and momentum parameters. Our approach models federated learning as a dynamical system, allowing us to draw on principles from numerical simulation and physical design. Through this perspective, selecting momentum parameters equates to critically damping the system for fast, stable convergence, while learning rates for clients and central servers are adaptively selected to satisfy accuracy properties from numerical simulation. The result is an adaptive, momentum-based federated learning algorithm in which the learning rates for clients and servers are dynamically adjusted and controlled by a single, global hyperparameter. By designing a fully integrated solution for both adaptive client updates and central agent aggregation, our method is capable of handling key challenges of heterogeneous federated learning, including objective inconsistency and client drift. Importantly, our approach achieves fast convergence while being insensitive to the choice of the global hyperparameter, making it well-suited for rapid prototyping and scalable deployment. Compared to state-of-the-art adaptive methods, our framework is shown to deliver superior convergence for heterogeneous federated learning while eliminating the need to tune hyperparameters for both client and server updates.
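The link between momentum and critical damping can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes plain heavy-ball momentum on a one-dimensional quadratic with known curvature, for which the update has a repeated characteristic root (critical damping) when the momentum equals (1 - sqrt(lr * curvature))^2. All names and values are hypothetical.

```python
import math

def critically_damped_momentum(lr, curvature):
    """Momentum giving a repeated root for heavy-ball on f(x) = curvature * x**2 / 2."""
    return (1.0 - math.sqrt(lr * curvature)) ** 2

def heavy_ball(x0, lr, curvature, momentum, steps):
    """Run x_{k+1} = x_k - lr * grad(x_k) + momentum * (x_k - x_{k-1})."""
    x_prev, x = x0, x0
    for _ in range(steps):
        grad = curvature * x
        x_prev, x = x, x - lr * grad + momentum * (x - x_prev)
    return x

lr, curvature = 0.25, 1.0
beta = critically_damped_momentum(lr, curvature)
x_final = heavy_ball(x0=1.0, lr=lr, curvature=curvature, momentum=beta, steps=60)
print(beta, x_final)
```

With these values the repeated root is 0.5, so the iterate contracts roughly geometrically without the oscillation that a larger momentum would introduce.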

artificial intelligence, machine learning, optimization problem, (15 more...)

2510.04203

Genre: Research Report > New Finding (0.68)

Industry:

Energy (0.47)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Invariance Learning based on Label Hierarchy

Neural Information Processing Systems | Aug-18-2025, 16:01:11 GMT

Deep Neural Networks inherit biased correlations embedded in training data and hence may fail to predict desired labels on unseen domains (or environments), whose distributions differ from that of the domain providing the training data. Invariance Learning (IL) has been developed recently to overcome this shortcoming; using training data from many domains, IL estimates a predictor that is invariant to a change of domain. However, the requirement of training data in multiple domains is a strong restriction on using IL, since it demands expensive annotation. We propose a novel IL framework to overcome this problem. Assuming the availability of data from multiple domains for a classification task at a higher level, for which the labeling cost is lower, we estimate an invariant predictor for the target classification task with training data gathered in a single domain. Additionally, we propose two cross-validation methods for selecting hyperparameters of invariance regularization, which has not been addressed properly in existing IL methods. The effectiveness of the proposed framework, including the cross-validation, is demonstrated empirically. Theoretical analysis reveals that our framework can estimate the desirable invariant predictor with a hyperparameter fixed correctly, and that such a preferable hyperparameter is chosen by the proposed CV methods under some conditions.

artificial intelligence, deep learning, machine learning, (18 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Reviews: Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

Neural Information Processing Systems | May-31-2025, 21:42:11 GMT

My concern about generalization still remains, and I hope the authors can devote maybe a sentence or two to it in the final draft - even something to the effect of "it is a concern; experimental evidence suggests it is not a great concern."

Summary: For any given ML algorithm, e.g., random forests, the paper proposes a transfer-learning approach for selection of hyperparameters (limited to those parameters that can be ordered) wherein a bounding space is constructed from previous evaluations of that algorithm on other datasets. Two types of bounding spaces are described. The box space is the tightest bounding box covering the best known hyperparameter settings for previous datasets. The ellipsoid is found as the smallest-volume ellipsoid covering the best known settings (via convex optimization).
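The box search space in the review summary reduces to a per-dimension minimum and maximum over the best settings found on earlier datasets. A minimal sketch, assuming numeric (orderable) hyperparameters; the function name and the example settings are hypothetical, and the ellipsoid variant is not covered because it needs a convex solver.

```python
def bounding_box(best_configs):
    """Tightest axis-aligned box containing every configuration in best_configs."""
    # Transpose so that each tuple holds one hyperparameter across datasets.
    per_dimension = list(zip(*best_configs))
    lower = [min(values) for values in per_dimension]
    upper = [max(values) for values in per_dimension]
    return lower, upper

# Hypothetical best (learning_rate, n_trees) settings from three earlier datasets.
best_configs = [(0.01, 100), (0.1, 50), (0.03, 200)]
lower, upper = bounding_box(best_configs)
print(lower, upper)
```

A search on a new dataset would then sample only inside `[lower[i], upper[i]]` for each hyperparameter `i`.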

bayesian optimization, dataset, search space, (10 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.55)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.44)

Conformal Calibration: Ensuring the Reliability of Black-Box AI in Wireless Systems

Simeone, Osvaldo, Park, Sangwoo, Zecchin, Matteo

arXiv.org Artificial Intelligence | Apr-29-2025

AI is poised to revolutionize telecommunication networks by boosting efficiency, automation, and decision-making. However, the black-box nature of most AI models introduces substantial risk, possibly deterring adoption by network operators. These risks are not addressed by the current prevailing deployment strategy, which typically follows a best-effort train-and-deploy paradigm. This paper reviews conformal calibration, a general framework that moves beyond the state of the art by adopting computationally lightweight, advanced statistical tools that offer formal reliability guarantees without requiring further training or fine-tuning. Conformal calibration encompasses pre-deployment calibration via uncertainty quantification or hyperparameter selection; online monitoring to detect and mitigate failures in real time; and counterfactual post-deployment performance analysis to address "what if" diagnostic questions after deployment. By weaving conformal calibration into the AI model lifecycle, network operators can establish confidence in black-box AI models as a dependable enabling technology for wireless systems.

A. Motivation

Next-generation wireless networks are expected to leverage AI for tasks ranging from physical-layer processing to resource management. Initiatives like O-RAN exemplify this trend by defining open network architectures that enable data-driven control at different time scales via modular AI applications [1]. While AI promises improved efficiency and flexibility, most AI apps function as black boxes, raising significant reliability concerns. These reliability concerns may make operators hesitant to cede network functionalities to black-box systems without additional safeguards.
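As an illustration of pre-deployment calibration via uncertainty quantification, the following is a minimal sketch of standard split conformal prediction, one common instance of this family of tools. It assumes exchangeable calibration data and a nonconformity score where lower means more plausible; the scores and labels are hypothetical and not taken from the paper.

```python
import math

def conformal_threshold(calibration_scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(calibration_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set must include everything.
        return math.inf
    return sorted(calibration_scores)[k - 1]

def prediction_set(scores_by_label, threshold):
    """Keep every candidate label whose nonconformity score is within the threshold."""
    return [label for label, score in scores_by_label.items() if score <= threshold]

# Hypothetical nonconformity scores of the true labels on held-out calibration data.
calibration_scores = [0.3, 0.1, 0.9, 0.5, 0.7, 0.2, 0.4]
threshold = conformal_threshold(calibration_scores, alpha=0.25)

# Hypothetical scores of each candidate label for one new input.
new_input_scores = {"cell_A": 0.15, "cell_B": 0.65, "cell_C": 0.95}
print(threshold, prediction_set(new_input_scores, threshold))
```

Under exchangeability, the resulting set contains the true label with probability at least `1 - alpha`, and no retraining of the underlying model is required.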

artificial intelligence, conformal prediction, prediction, (18 more...)

2504.0931

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Telecommunications (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence (1.00)

Ensuring Reliability via Hyperparameter Selection: Review and Advances

Farzaneh, Amirmohammad, Simeone, Osvaldo

arXiv.org Artificial Intelligence | Feb-6-2025

Hyperparameter selection is a critical step in the deployment of artificial intelligence (AI) models, particularly in the current era of foundational, pre-trained models. By framing hyperparameter selection as a multiple hypothesis testing problem, recent research has shown that it is possible to provide statistical guarantees on population risk measures attained by the selected hyperparameter. This paper reviews the Learn-Then-Test (LTT) framework, which formalizes this approach, and explores several extensions tailored to engineering-relevant scenarios. These extensions encompass different risk measures and statistical guarantees, multi-objective optimization, the incorporation of prior knowledge and dependency structures into the hyperparameter selection process, as well as adaptivity. The paper also includes illustrative applications for communication systems.

artificial intelligence, machine learning, optimization problem, (15 more...)

2502.04206

Country: