irace-evo: Automatic Algorithm Configuration Extended With LLM-Based Code Evolution

Sartori, Camilo Chacón, Blum, Christian

arXiv.org Artificial Intelligence

Automatic algorithm configuration tools such as irace efficiently tune parameter values but leave algorithmic code unchanged. This paper introduces a first version of irace-evo, an extension of irace that integrates code evolution through large language models (LLMs) to jointly explore parameter and code spaces. The proposed framework enables multi-language support (e.g., C++, Python), reduces token consumption via progressive context management, and employs the Always-From-Original principle to ensure robust and controlled code evolution. We evaluate irace-evo on the Construct, Merge, Solve & Adapt (CMSA) metaheuristic for the Variable-Sized Bin Packing Problem (VSBPP). Experimental results show that irace-evo can discover new algorithm variants that outperform the state-of-the-art CMSA implementation while maintaining low computational and monetary costs. Notably, irace-evo generates competitive algorithmic improvements using lightweight models (e.g., Claude Haiku 3.5) with a total usage cost under 2 euros. These results demonstrate that coupling automatic configuration with LLM-driven code evolution provides a powerful, cost-efficient avenue for advancing heuristic design and metaheuristic optimization.
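The joint parameter-and-code search described above can be sketched as a toy loop. This is a minimal illustration, not the irace-evo implementation: `evaluate`, `propose_variant`, the variant names, and all cost values are stand-ins (a real run would execute the tuned metaheuristic, e.g. CMSA on VSBPP instances, and query an LLM for code mutations).

```python
import random

def evaluate(variant, params):
    # Stand-in objective: in the real system this would run the tuned
    # algorithm variant and return its solution cost.
    base = {"original": 10.0, "variant_a": 9.5, "variant_b": 11.0}[variant]
    return base + (params["alpha"] - 0.3) ** 2

def propose_variant(original_code_id, rng):
    # Stand-in for an LLM call. Always-From-Original: mutations are always
    # derived from the unmodified original code, never from an earlier
    # LLM-generated variant, which keeps evolution controlled.
    return rng.choice(["variant_a", "variant_b"])

def irace_evo_sketch(budget=30, seed=0):
    rng = random.Random(seed)
    pool = ["original"]                        # code space (grows via LLM)
    best = ("original", {"alpha": 0.5})        # (variant, parameters)
    best_cost = evaluate(*best)
    for it in range(budget):
        if it % 10 == 5:                       # occasionally evolve code
            pool.append(propose_variant("original", rng))
        cand = (rng.choice(pool), {"alpha": rng.uniform(0.0, 1.0)})
        cost = evaluate(*cand)                 # joint parameter/code search
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost
```

The key design choice mirrored here is that code mutation and parameter sampling share one evaluation budget, so the configurator can trade attempts between the two spaces.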


A Self-Evolving AI Agent System for Climate Science

Guo, Zijie, Wang, Jiong, Ling, Fenghua, Wei, Wangxu, Yue, Xiaoyu, Jiang, Zhe, Xu, Wanghan, Luo, Jing-Jia, Cheng, Lijing, Ham, Yoo-Geun, Song, Fengfei, Gentine, Pierre, Yamagata, Toshio, Fei, Ben, Zhang, Wenlong, Gu, Xinyu, Li, Chao, Wang, Yaqiang, Chen, Tao, Ouyang, Wanli, Zhou, Bowen, Bai, Lei

arXiv.org Artificial Intelligence

Scientific progress in Earth science depends on integrating data across the planet's interconnected spheres. However, the accelerating volume and fragmentation of multi-sphere knowledge and data have surpassed human analytical capacity. This creates a major bottleneck for discovery, especially in climate science. To address this challenge, we introduce EarthLink, the first self-evolving AI agent system designed as an interactive "copilot" for Earth scientists. Through natural language interaction, EarthLink automates the entire research workflow by integrating planning, code execution, data analysis, and physical reasoning into a unified process that directly addresses this limitation. Beyond efficiency, it exhibits human-like cross-disciplinary analytical ability and achieves proficiency comparable to a junior researcher in expert evaluations on core large-scale climate tasks, including model-observation comparison and climate change understanding. When tasked with an open scientific problem, specifically the discovery of precursors of the Atlantic Niño, EarthLink autonomously developed a research strategy, identified sources of predictability, verified its hypotheses with available data, and proposed a physically consistent mechanism. These emerging capabilities enable a new human-AI research paradigm. Scientists can focus on value and result judgments, while AI systems handle complex data analysis and knowledge integration. This accelerates the pace and breadth of discovery in Earth sciences. The system is accessible at our website https://earthlink.intern-ai.org.cn.
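The unified planning / code-execution / analysis / reasoning workflow the abstract describes can be caricatured as a tiny pipeline. Everything below (the step names, the handler outputs, the correlation value) is hypothetical and only illustrates the loop's shape, not EarthLink's actual architecture.

```python
from dataclasses import dataclass, field

@dataclass
class ResearchTask:
    question: str
    plan: list = field(default_factory=list)
    results: dict = field(default_factory=dict)

def plan_steps(task):
    # Stand-in planner; a real copilot would derive the plan with an LLM.
    task.plan = ["load_data", "analyze", "interpret"]

def run_step(task, step):
    # Stand-in executors for code execution, data analysis, and
    # physical reasoning; outputs here are fabricated placeholders.
    handlers = {
        "load_data": lambda: "sst_anomalies",
        "analyze":   lambda: {"correlation": 0.62},
        "interpret": lambda: "wind-driven precursor hypothesis",
    }
    task.results[step] = handlers[step]()

def copilot(question):
    task = ResearchTask(question)
    plan_steps(task)
    for step in task.plan:      # planning and execution in one process
        run_step(task, step)
    return task.results
```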


Bayesian Optimization of Process Parameters of a Sensor-Based Sorting System using Gaussian Processes as Surrogate Models

Kronenwett, Felix, Maier, Georg, Längle, Thomas

arXiv.org Artificial Intelligence

Sensor-based sorting systems enable the physical separation of a material stream into two fractions. The sorting decision is based on the image data evaluation of the sensors used and is carried out using actuators. Various process parameters must be set depending on the properties of the material stream, the dimensioning of the system, and the required sorting accuracy. However, continuous verification and re-adjustment are necessary due to changing requirements and material stream compositions. In this paper, we introduce an approach for optimizing, recurrently monitoring, and adjusting the process parameters of a sensor-based sorting system. Based on Bayesian Optimization, Gaussian process regression models are used as surrogate models to achieve specific requirements for system behavior with the uncertainties contained therein. This method minimizes the number of necessary experiments while simultaneously considering two possible optimization targets based on the requirements for both material output streams. In addition, uncertainties are taken into account when determining sorting accuracies in the model calculation. We evaluated the method with three example process parameters.
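The core machinery (a Gaussian process surrogate plus an acquisition function that picks the next parameter setting to test) can be sketched in plain NumPy. This is a generic textbook Bayesian optimization loop, not the authors' system: the RBF kernel, noise level, expected-improvement acquisition, and the 1-D normalized parameter space are all illustrative assumptions.

```python
import math
import numpy as np

def rbf(a, b, length=0.2):
    # Squared-exponential kernel on 1-D inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # GP regression: posterior mean and std at candidate points Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    sol = np.linalg.solve(K, y)
    mu = Ks.T @ sol
    cov = rbf(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks)
    return mu, np.sqrt(np.clip(np.diag(cov), 1e-12, None))

def expected_improvement(mu, sigma, best):
    # EI for minimization: E[max(best - Y, 0)] under the GP posterior.
    z = (best - mu) / sigma
    Phi = 0.5 * (1 + np.vectorize(math.erf)(z / np.sqrt(2)))
    phi = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return (best - mu) * Phi + sigma * phi

def bayes_opt(f, n_init=3, n_iter=10, seed=0):
    # f: expensive experiment returning a cost for a parameter in [0, 1].
    rng = np.random.default_rng(seed)
    X = rng.uniform(0, 1, n_init)
    y = np.array([f(x) for x in X])
    cand = np.linspace(0, 1, 200)
    for _ in range(n_iter):
        mu, sigma = gp_posterior(X, y, cand)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))          # run one more experiment
    return X[np.argmin(y)], y.min()
```

The point of the surrogate is visible in the budget: only `n_init + n_iter` calls to the expensive experiment are made, with the GP interpolating everywhere else.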


ClustRecNet: A Novel End-to-End Deep Learning Framework for Clustering Algorithm Recommendation

Bakhtyari, Mohammadreza, Mazoure, Bogdan, de Amorim, Renato Cordeiro, Rabusseau, Guillaume, Makarenkov, Vladimir

arXiv.org Artificial Intelligence

We introduce ClustRecNet - a novel deep learning (DL)-based recommendation framework for determining the most suitable clustering algorithms for a given dataset, addressing the long-standing challenge of clustering algorithm selection in unsupervised learning. To enable supervised learning in this context, we construct a comprehensive data repository comprising 34,000 synthetic datasets with diverse structural properties. Each of them was processed using 10 popular clustering algorithms. The resulting clusterings were assessed via the Adjusted Rand Index (ARI) to establish ground truth labels, used for training and evaluation of our DL model. The proposed network architecture integrates convolutional, residual, and attention mechanisms to capture both local and global structural patterns from the input data. This design supports end-to-end training to learn compact representations of datasets and enables direct recommendation of the most suitable clustering algorithm, reducing reliance on handcrafted meta-features and traditional Cluster Validity Indices (CVIs). Comprehensive experiments across synthetic and real-world benchmarks demonstrate that our DL model consistently outperforms conventional CVIs (e.g. Silhouette, Calinski-Harabasz, Davies-Bouldin, and Dunn) as well as state-of-the-art AutoML clustering recommendation approaches (e.g. ML2DAC, AutoCluster, and AutoML4Clust). Notably, the proposed model achieves a 0.497 ARI improvement over the Calinski-Harabasz index on synthetic data and a 15.3% ARI gain over the best-performing AutoML approach on real-world data.
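The ARI-based ground-truth labeling step described above is easy to sketch: score each algorithm's partition against the known labels with the Adjusted Rand Index, and label the dataset with the best-scoring algorithm. The helper below is a generic contingency-table ARI implementation, not code from the paper; the algorithm names in the usage are hypothetical.

```python
import numpy as np

def adjusted_rand_index(a, b):
    # ARI computed from the contingency table of two labelings.
    a, b = np.asarray(a), np.asarray(b)
    table = np.array([[np.sum((a == i) & (b == j)) for j in np.unique(b)]
                      for i in np.unique(a)])
    comb2 = lambda x: x * (x - 1) / 2.0      # "n choose 2", vectorized
    sum_ij = comb2(table).sum()
    sum_a = comb2(table.sum(axis=1)).sum()
    sum_b = comb2(table.sum(axis=0)).sum()
    expected = sum_a * sum_b / comb2(len(a))
    max_index = (sum_a + sum_b) / 2.0
    return (sum_ij - expected) / (max_index - expected)

def best_algorithm_label(true_labels, candidate_partitions):
    # candidate_partitions: {algorithm_name: predicted labels}.
    # Returns the ground-truth label used to train the recommender.
    scores = {name: adjusted_rand_index(true_labels, p)
              for name, p in candidate_partitions.items()}
    return max(scores, key=scores.get), scores
```

Repeating this over the synthetic repository yields one (dataset, best-algorithm) pair per dataset, which is what turns the unsupervised selection problem into a supervised one.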


A Appendix

Neural Information Processing Systems

A.1 Compute Usage. The seven-billion-parameter language model used as part of Frozen was run with model parallelism. To generate a 2-way question with n inner-shots, two classes are sampled and assigned the truncated captions "this is a dax" and "this is a blicket"; one of the two classes is then selected for the question, and the truncated caption "this is a" is assigned to the query image (in some settings, five distinct classes are sampled in the first step). All images are stored at 224x224 resolution. To generate Real-Name miniImageNet, the same process is followed, except that real class names are used in place of the nonsense words (e.g., "this is a fruit bat" instead of "this is a dax"). For the evaluations in this paper, images are again taken only from the validation set, and only 2-way Fast-VQA is considered. To generate Guided-VQA, the same process is followed, except that the (first) class name is provided. The Open-Ended miniImageNet, Real-Name miniImageNet, Fast-VQA, and Guided-VQA evaluations are available at https://fh295.github.io/frozen.html.


A Review on Single-Problem Multi-Attempt Heuristic Optimization

Echevarrieta, Judith, Arza, Etor, Pérez, Aritz, Ceberio, Josu

arXiv.org Artificial Intelligence

In certain real-world optimization scenarios, practitioners are not interested in solving multiple problems but rather in finding the best solution to a single, specific problem. When the computational budget is large relative to the cost of evaluating a candidate solution, multiple heuristic alternatives can be tried to solve the same given problem, each possibly with a different algorithm, parameter configuration, initialization, or stopping criterion. The sequential selection of which alternative to try next is crucial for efficiently identifying the one that provides the best possible solution across multiple attempts. Despite the relevance of this problem in practice, it has not yet been the exclusive focus of any existing review. Several sequential alternative selection strategies have been proposed in different research topics, but they have not been comprehensively and systematically unified under a common perspective. This work presents a focused review of single-problem multi-attempt heuristic optimization. It brings together strategies suitable for this problem that have been studied separately in algorithm selection, parameter tuning, multi-start, and resource allocation. These strategies are explained using a unified terminology within a common framework, which supports the development of a taxonomy for systematically organizing and classifying them.
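One family of sequential selection strategies covered by such a framework (bandit-style resource allocation) can be sketched with a UCB1 rule adapted to minimization: try each alternative once, then repeatedly pick the alternative whose lower confidence bound on mean cost is smallest, tracking the best solution found over all attempts. The solver callables and budget below are illustrative assumptions, not a strategy taken from the review.

```python
import math

def select_and_run(alternatives, budget=40):
    # alternatives: list of zero-argument callables; each call is one
    # attempt of a heuristic (a restart, a configuration, ...) and
    # returns the cost of the solution it found.
    counts = [0] * len(alternatives)
    means = [0.0] * len(alternatives)
    best_cost = math.inf
    for t in range(budget):
        if t < len(alternatives):
            i = t                          # try every alternative once
        else:
            # UCB1 for minimization: low mean cost + exploration bonus.
            i = min(range(len(alternatives)),
                    key=lambda k: means[k]
                    - math.sqrt(2 * math.log(t) / counts[k]))
        cost = alternatives[i]()
        counts[i] += 1
        means[i] += (cost - means[i]) / counts[i]   # running mean
        best_cost = min(best_cost, cost)            # best across attempts
    return best_cost, counts
```

Note the objective being optimized by the allocation (mean attempt quality) is a proxy; the quantity the practitioner ultimately cares about is the minimum over all attempts, which is exactly the distinction the single-problem multi-attempt setting emphasizes.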