Goto

Collaborating Authors

 optuna


012a91467f210472fab4e11359bbfef6-AuthorFeedback.pdf

Neural Information Processing Systems

First, as R4 suggested, "symbolic35 tree" was more approachable for people in the ML community. Second, the symbolic tree is declared by the user using36 decorators and serves to represent high-level program constructs, which is different from the AST that represents all37 the syntactic structures for the program. For example, the full Python AST contains information about objects' class38 methods, whereas our symbolic representation does not.39 R4: "Second, most of their tool/language design could be summarized as adding some kind of non determinis-40 tic/parametric choice ... It's extension to ML does not introduce anything particularly new ..."41 We agree with R4 that symbolic programming and non-deterministic programming are well-studied topics in the PL42 community. However, we would like to emphasize that this work is the first to introduce such concepts to AutoML43 to significantly reduce engineering effort, which is a novel and useful contribution. For example, PyGlove leverages44 symbolic manipulation to decouple the search algorithm, search space and child program, which enabled us to unify45 the interface among search methods with and without weight sharing. To enable symbolic programming in Python,46 PyGlove implements an object model for maintaining the consistency of program state during symbolic manipulation.47 R4 "Provide the grammar in the main text"48 We understand the "grammar" here as a reference to the formal definition of the search space specification. We will49 revise current Appendix Table 3 into a formal definition, and add it to the "search space" sub-section.50



Enhancing Speech Emotion Recognition via Fine-Tuning Pre-Trained Models and Hyper-Parameter Optimisation

arXiv.org Artificial Intelligence

We propose a workflow for speech emotion recognition (SER) that combines pre-trained representations with automated hyperparameter optimisation (HPO). Using SpeechBrain wav2vec2-base model fine-tuned on IEMOCAP as the encoder, we compare two HPO strategies, Gaussian Process Bayesian Optimisation (GP-BO) and Tree-structured Parzen Estimators (TPE), under an identical four-dimensional search space and 15-trial budget, with balanced class accuracy (BCA) on the German EmoDB corpus as the objective. All experiments run on 8 CPU cores with 32 GB RAM. GP-BO achieves 0.96 BCA in 11 minutes, and TPE (Hyperopt implementation) attains 0.97 in 15 minutes. In contrast, grid search requires 143 trials and 1,680 minutes to exceed 0.9 BCA, and the best AutoSpeech 2020 baseline reports only 0.85 in 30 minutes on GPU. For cross-lingual generalisation, an EmoDB-trained HPO-tuned model improves zero-shot accuracy by 0.25 on CREMA-D and 0.26 on RAVDESS. Results show that efficient HPO with pre-trained encoders delivers competitive SER on commodity CPUs. Source code to this work is available at: https://github.com/youngaryan/speechbrain-emotion-hpo.


SHAPoint: Task-Agnostic, Efficient, and Interpretable Point-Based Risk Scoring via Shapley Values

arXiv.org Artificial Intelligence

Interpretable risk scores play a vital role in clinical decision support, yet traditional methods for deriving such scores often rely on manual preprocessing, task-specific modeling, and simplified assumptions that limit their flexibility and predictive power. We present SHAPoint, a novel, task-agnostic framework that integrates the predictive accuracy of gradient boosted trees with the interpretability of point-based risk scores. SHAPoint supports classification, regression, and survival tasks, while also inheriting valuable properties from tree-based models, such as native handling of missing data and support for monotonic constraints. Compared to existing frameworks, SHAPoint offers superior flexibility, reduced reliance on manual preprocessing, and faster runtime performance. Empirical results show that SHAPoint produces compact and interpretable scores with predictive performance comparable to state-of-the-art methods, but at a fraction of the runtime, making it a powerful tool for transparent and scalable risk stratification.


Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning?

arXiv.org Artificial Intelligence

Optimal hyperparameter selection is critical for maximizing the performance of neural networks in computer vision, particularly as architectures become more complex. This work explores the use of large language models (LLMs) for hyperparameter optimization by fine-tuning a parameter-efficient version of Code Llama using LoRA. The resulting model produces accurate and computationally efficient hyperparameter recommendations across a wide range of vision architectures. Unlike traditional methods such as Optuna, which rely on resource-intensive trial-and-error procedures, our approach achieves competitive or superior Root Mean Square Error (RMSE) while substantially reducing computational overhead. Importantly, the models evaluated span image-centric tasks such as classification, detection, and segmentation, fundamental components in many image manipulation pipelines including enhancement, restoration, and style transfer . Our results demonstrate that LLM-based optimization not only rivals established Bayesian methods like Tree-structured Parzen Estimators (TPE), but also accelerates tuning for real-world applications requiring perceptual quality and low-latency processing.


OptiMindTune: A Multi-Agent Framework for Intelligent Hyperparameter Optimization

arXiv.org Artificial Intelligence

Hyperparameter optimization (HPO) is a critical yet challenging aspect of machine learning model development, significantly impacting model performance and generalization. Traditional HPO methods often struggle with high dimensionality, complex interdependencies, and computational expense. This paper introduces OptiMindTune, a novel multi-agent framework designed to intelligently and efficiently optimize hyperparameters. OptiMindTune leverages the collaborative intelligence of three specialized AI agents -- a Recommender Agent, an Evaluator Agent, and a Decision Agent -- each powered by Google's Gemini models. These agents address distinct facets of the HPO problem, from model selection and hyperparameter suggestion to robust evaluation and strategic decision-making. By fostering dynamic interactions and knowledge sharing, OptiMindTune aims to converge to optimal hyperparameter configurations more rapidly and robustly than existing single-agent or monolithic approaches. Our framework integrates principles from advanced large language models, and adaptive search to achieve scalable and intelligent AutoML. We posit that this multi-agent paradigm offers a promising avenue for tackling the increasing complexity of modern machine learning model tuning.


Review for NeurIPS paper: PyGlove: Symbolic Programming for Automated Machine Learning

Neural Information Processing Systems

Summary and Contributions: The paper introduces an AutoML library that tries to find its own sweet spot in the large ecosystem of newly minted AutoML libraries. The paper introduces a symbolic frontend to build neural network models, with simple fundamental constructs that provide choice insertions. Unlike all other packages that I have seen and reviewed, such as Keras Tuner, NNI, AutoGluon, Optuna (btw reference missing to Optuna, you should consider adding), this paper introduces something innovative and elegant. All these other packages consistently suffer from the code of the model definition getting ugly and unweildy really quickly when you have to introduce model structure searches, and when there's interaction between structure searches and size searches. In this paper, the authors cleanly separate model structure definitions from each layer's hyperparameter choices.


TBBC: Predict True Bacteraemia in Blood Cultures via Deep Learning

arXiv.org Artificial Intelligence

Bacteraemia, a bloodstream infection with high morbidity and mortality rates, poses significant diagnostic challenges. Accurate diagnosis through blood cultures is resource-intensive. Developing a machine learning model to predict blood culture outcomes in emergency departments offers potential for improved diagnosis, reduced healthcare costs, and mitigated antibiotic use.This thesis aims to identify optimal machine learning techniques for predicting bacteraemia and develop a predictive model using data from St. Antonius Hospital's emergency department. Based on current literature, CatBoost and Random Forest were selected as the most promising techniques. Model optimization using Optuna prioritized sensitivity.The final Random Forest model achieved an ROC AUC of 0.78 and demonstrated 0.92 sensitivity on the test set. Notably, it accurately identified 36.02% of patients at low risk of bacteraemia, with only 0.85% false negatives.Implementation of this model in St. Antonius Hospital's emergency department could reduce blood cultures, healthcare costs, and antibiotic treatments. Future studies should focus on external validation, exploring advanced techniques, and addressing potential confounders to ensure model generalizability.


Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

arXiv.org Artificial Intelligence

Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved state-of-the-art performance in learning equilibrium policies of two-player zero-sum games. However, the hand-crafted hyperparameter value selection in most of the existing works requires extensive domain knowledge, forming the main barrier to applying PSRO to different games. In this work, we make the first attempt to investigate the possibility of self-adaptively determining the optimal hyperparameter values in the PSRO framework. Our contributions are three-fold: (1) Using several hyperparameters, we propose a parametric PSRO that unifies the gradient descent ascent (GDA) and different PSRO variants. (2) We propose the self-adaptive PSRO (SPSRO) by casting the hyperparameter value selection of the parametric PSRO as a hyperparameter optimization (HPO) problem where our objective is to learn an HPO policy that can self-adaptively determine the optimal hyperparameter values during the running of the parametric PSRO. (3) To overcome the poor performance of online HPO methods, we propose a novel offline HPO approach to optimize the HPO policy based on the Transformer architecture. Experiments on various two-player zero-sum games demonstrate the superiority of SPSRO over different baselines.


OptBA: Optimizing Hyperparameters with the Bees Algorithm for Improved Medical Text Classification

arXiv.org Artificial Intelligence

One of the challenges that artificial intelligence engineers face, specifically in the field of deep learning is obtaining the optimal model hyperparameters. The search for optimal hyperparameters usually hinders the progress of solutions to real-world problems such as healthcare. To overcome this hurdle, the proposed work introduces a novel mechanism called ``OptBA" to automatically fine-tune the hyperparameters of deep learning models by leveraging the Bees Algorithm, which is a recent promising swarm intelligence algorithm. In this paper, the optimization problem of OptBA is to maximize the accuracy in classifying ailments using medical text, where initial hyperparameters are iteratively adjusted by specific criteria. Experimental results demonstrate a noteworthy enhancement in accuracy with approximately 1.4%. This outcome highlights the effectiveness of the proposed mechanism in addressing the critical issue of hyperparameter optimization and its potential impact on advancing solutions for healthcare and other societal challenges.