AITopics

In certain real-world optimization scenarios, practitioners are not interested in solving multiple problems but rather in finding the best solution to a single, specific problem. When the computational budget is large relative to the cost of evaluating a candidate solution, multiple heuristic alternatives can be tried to solve the same given problem, each possibly with a different algorithm, parameter configuration, initialization, or stopping criterion. The sequential selection of which alternative to try next is crucial for efficiently identifying the one that provides the best possible solution across multiple attempts. Despite the relevance of this problem in practice, it has not yet been the exclusive focus of any existing review. Several sequential alternative selection strategies have been proposed in different research topics, but they have not been comprehensively and systematically unified under a common perspective. This work presents a focused review of single-problem multi-attempt heuristic optimization. It brings together suitable strategies to this problem that have been studied separately through algorithm selection, parameter tuning, multi-start and resource allocation. These strategies are explained using a unified terminology within a common framework, which supports the development of a taxonomy for systematically organizing and classifying them.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

2509.26321

Country:

North America > United States > Massachusetts (0.28)
Europe > United Kingdom > Scotland (0.28)

Genre:

Overview (1.00)
Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Transportation (0.68)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Jha, Animesh, Gupta, Harshit, Nandi, Ananjan

RL-Guided Data Selection for Language Model Finetuning

Data selection for finetuning Large Language Models (LLMs) can be framed as a budget-constrained optimization problem: maximizing a model's downstream performance under a strict training data budget. Solving this problem is generally intractable, and existing approximate approaches are pretraining-oriented and transfer poorly to the fine-tuning setting. We reformulate this problem as a tractable Markov Decision Process (MDP) and train agents using various Reinforcement Learning (RL) methods to learn optimal data selection policies, guided by an efficient, proxy-model-based reward signal. Across four datasets, training on a $5\%$ subset selected by our approach matches or outperforms fine-tuning on the full dataset by up to $10.8$ accuracy points, while cutting wall-clock training time by up to $2 \times$, highlighting the promise of RL-guided data selection.

large language model, machine learning, reinforcement learning, (21 more...)

2509.2585

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)

Kimiaei, Morteza, Kungurtsev, Vyacheslav

Machine Learning Algorithms for Improving Black Box Optimization Solvers

Black-box optimization (BBO) addresses problems where objectives are accessible only through costly queries without gradients or explicit structure. Classical derivative-free methods -- line search, direct search, and model-based solvers such as Bayesian optimization -- form the backbone of BBO, yet often struggle in high-dimensional, noisy, or mixed-integer settings. Recent advances use machine learning (ML) and reinforcement learning (RL) to enhance BBO: ML provides expressive surrogates, adaptive updates, meta-learning portfolios, and generative models, while RL enables dynamic operator configuration, robustness, and meta-optimization across tasks. This paper surveys these developments, covering representative algorithms such as NNs with the modular model-based optimization framework (mlrMBO), zeroth-order adaptive momentum methods (ZO-AdaMM), automated BBO (ABBO), distributed block-wise optimization (DiBB), partition-based Bayesian optimization (SPBOpt), the transformer-based optimizer (B2Opt), diffusion-model-based BBO, surrogate-assisted RL for differential evolution (Surr-RLDE), robust BBO (RBO), coordinate-ascent model-based optimization with relative entropy (CAS-MORE), log-barrier stochastic gradient descent (LB-SGD), policy improvement with black-box (PIBB), and offline Q-learning with Mamba backbones (Q-Mamba). We also review benchmark efforts such as the NeurIPS 2020 BBO Challenge and the MetaBox framework. Overall, we highlight how ML and RL transform classical inexact solvers into more scalable, robust, and adaptive frameworks for real-world optimization.

evolutionary algorithm, machine learning, reinforcement learning, (19 more...)

2509.25592

Country:

North America > United States (1.00)
Europe (0.67)

Genre:

Overview (0.87)
Research Report (0.63)

Industry:

Transportation > Air (0.83)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

Shah, Rishi Ashish, Dhondiyal, Shivaay, Sharma, Kartik, Talwar, Sukriti, Jain, Saksham, Jain, Sparsh

GESA: Graph-Enhanced Semantic Allocation for Generalized, Fair, and Explainable Candidate-Role Matching

Abstract--Accurate, fair, and explainable allocation of candidates to roles represents a fundamental challenge across multiple domains including corporate hiring, academic admissions, fellowship awards, and volunteer placement systems. Current state-of-the-art approaches suffer from semantic inflexibility, persistent demographic bias, opacity in decision-making processes, and poor scalability under dynamic policy constraints. Our experimental evaluation on large-scale international benchmarks comprising 20,000 candidate profiles and 3,000 role specifications demonstrates superior performance with 94.5% top-3 allocation accuracy, 37% improvement in diversity representation, 0.98 fairness score across demographic categories, and sub-second end-to-end latency. Additionally, GESA incorporates hybrid recommendation capabilities and glass-box explainability, making it suitable for deployment across diverse international contexts in industry, academia, and non-profit sectors. The problem of matching candidates to appropriate roles efficiently and fairly represents one of the most critical challenges in modern organizational and institutional decision-making processes. This challenge spans multiple domains: corporate talent acquisition where companies struggle to identify optimal candidates from thousands of applications [1], academic admissions where universities must select students who will thrive in specific programs [2], research fellowship allocation where funding bodies need to match candidates with projects [3], and volunteer placement systems where non-profit organizations seek to optimize volunteer-task assignments [4]. Despite decades of research and development, existing allocation systems continue to exhibit fundamental limitations that significantly impact their effectiveness and fairness. First, semantic inflexibility remains a persistent issue--traditional keyword-based and static embedding approaches fail to capture the nuanced contextual relationships between candidate qualifications and role requirements [5].

artificial intelligence, machine learning, natural language, (19 more...)

2509.25435

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Architecture (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
(4 more...)

Heterogeneous Multi-agent Collaboration in UAV-assisted Mobile Crowdsensing Networks

Deng, Xianyang, Liu, Wenshuai, FuB, Yaru, Zhu, Qi

Unmanned aerial vehicles (UAVs)-assisted mobile crowdsensing (MCS) has emerged as a promising paradigm for data collection. However, challenges such as spectrum scarcity, device heterogeneity, and user mobility hinder efficient coordination of sensing, communication, and computation. To tackle these issues, we propose a joint optimization framework that integrates time slot partition for sensing, communication, and computation phases, resource allocation, and UAV 3D trajectory planning, aiming to maximize the amount of processed sensing data. The problem is formulated as a non-convex stochastic optimization and further modeled as a partially observable Markov decision process (POMDP) that can be solved by multi-agent deep reinforcement learning (MADRL) algorithm. To overcome the limitations of conventional multi-layer perceptron (MLP) networks, we design a novel MADRL algorithm with hybrid actor network. The newly developed method is based on heterogeneous agent proximal policy optimization (HAPPO), empowered by convolutional neural networks (CNN) for feature extraction and Kolmogorov-Arnold networks (KAN) to capture structured state-action dependencies. Extensive numerical results demonstrate that our proposed method achieves significant improvements in the amount of processed sensing data when compared with other benchmarks.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2509.25261

Country: Asia > China (0.46)

Genre: Research Report (0.70)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Wang, Yanke, Flouris, Kyriakos

Hyperbolic Optimization

This work explores optimization methods on hyperbolic manifolds. Building on Riemannian optimization principles, we extend the Hyperbolic Stochastic Gradient Descent (a specialization of Riemannian SGD) to a Hyperbolic Adam optimizer. While these methods are particularly relevant for learning on the Poincaré ball, they may also provide benefits in Euclidean and other non-Euclidean settings, as the chosen optimization encourages the learning of Poincaré embeddings. This representation, in turn, accelerates convergence in the early stages of training, when parameters are far from the optimum. As a case study, we train diffusion models using the hyperbolic optimization methods with hyperbolic time-discretization of the Langevin dynamics, and show that they achieve faster convergence on certain datasets without sacrificing generative quality.

artificial intelligence, machine learning, optimization problem, (13 more...)

2509.25206

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

OrthAlign: Orthogonal Subspace Decomposition for Non-Interfering Multi-Objective Alignment

Lin, Liang, Xu, Zhihao, Dong, Junhao, Zhao, Jian, Yuan, Yuchen, Zhang, Guibin, Yu, Miao, Zhang, Yiming, Yao, Zhengtao, Yi, Huahui, Liu, Dongrui, Li, Xinfeng, Wang, Kun

Large language model (LLM) alignment faces a critical dilemma when addressing multiple human preferences: improvements in one dimension frequently come at the expense of others, creating unavoidable trade-offs between competing objectives like helpfulness and harmlessness. While prior work mainly focuses on constraint-based optimization algorithms and data selection strategies to mitigate conflicts, these approaches overlook the fundamental issue of resolving conflicts directly at the parameter level. In this paper, we present OrthAlign, an innovative approach that pioneers a new paradigm by leveraging orthogonal subspace decomposition to fundamentally resolve gradient-level conflicts in multi-objective preference alignment. OrthAlign strategically decomposes parameter update spaces into orthogonal subspaces, ensuring that optimization toward different preferences occurs in mathematically non-interfering directions. Building upon this, we provide theoretical guarantees demonstrating that when parameter increments satisfy both orthogonal subspace constraints and spectral norm bounds, the resulting updates exhibit linear Lipschitz growth rather than exponential instability, ensuring stable convergence across all preference dimensions. Extensive experiments show that: I. OrthAlign achieves maximum single-preference improvements ranging from 34.61% to 50.89% after multiple-objective alignment across helpful, harmless, and truthful dimensions. II. With an average overall reward improvement of 13.96%.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2509.2461

Country: Asia (0.29)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.68)
Law Enforcement & Public Safety (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Experience-Guided Reflective Co-Evolution of Prompts and Heuristics for Automatic Algorithm Design

Liu, Yihong, Li, Junyi, Zhao, Wayne Xin, Lu, Hongyu, Wen, Ji-Rong

Combinatorial optimization problems are traditionally tackled with handcrafted heuristic algorithms, which demand extensive domain expertise and significant implementation effort. Recent progress has highlighted the potential of automatic heuristics design powered by large language models (LLMs), enabling the automatic generation and refinement of heuristics. These approaches typically maintain a population of heuristics and employ LLMs as mutation operators to evolve them across generations. While effective, such methods often risk stagnating in local optima. To address this issue, we propose the Experience-Guided Reflective Co-Evolution of Prompt and Heuristics (EvoPH) for automatic algorithm design, a novel framework that integrates the island migration model with the elites selection algorithm to simulate diverse heuristics populations. In EvoPH, prompts are co-evolved with heuristic algorithms, guided by performance feedback. We evaluate our framework on two problems, i.e., Traveling Salesman Problem and Bin Packing Problem. Experimental results demonstrate that EvoPH achieves the lowest relative error against optimal solutions across both datasets, advancing the field of automatic algorithm design with LLMs.

evolutionary algorithm, large language model, machine learning, (18 more...)

2509.24509

Country: Asia (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

FMIP: Joint Continuous-Integer Flow For Mixed-Integer Linear Programming

Li, Hongpei, Yuan, Hui, Zhang, Han, Lin, Jianghao, Ge, Dongdong, Wang, Mengdi, Ye, Yinyu

Mixed-Integer Linear Programming (MILP) is a foundational tool for complex decision-making problems. However, the NP-hard nature of MILP presents a significant computational challenge, motivating the development of machine learning-based heuristic solutions to accelerate downstream solvers. While recent generative models have shown promise in learning powerful heuristics, they suffer from a critical limitation. That is, they model the distribution of only the integer variables and fail to capture the intricate coupling between integer and continuous variables, creating an information bottleneck and ultimately leading to suboptimal solutions. To this end, we propose Joint Continuous-Integer Flow for Mixed-Integer Linear Programming (FMIP), which is the first generative framework that models the joint distribution of both integer and continuous variables for MILP solutions. Built upon the joint modeling paradigm, a holistic guidance mechanism is designed to steer the generative trajectory, actively refining solutions toward optimality and feasibility during the inference process. Extensive experiments on eight standard MILP benchmarks demonstrate the superior performance of FMIP against existing baselines, reducing the primal gap by 41.34% on average. Moreover, we show that FMIP is fully compatible with arbitrary backbone networks and various downstream solvers, making it well-suited for a broad range of real-world MILP applications.

artificial intelligence, fmip, optimization problem, (15 more...)

2507.2339

Country: Asia (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Kundu, Akash, Mangini, Stefano

TensorRL-QAS: Reinforcement learning with tensor networks for improved quantum architecture search

Variational quantum algorithms hold the promise to address meaningful quantum problems already on noisy intermediate-scale quantum hardware. In spite of the promise, they face the challenge of designing quantum circuits that both solve the target problem and comply with device limitations. Quantum architecture search (QAS) automates the design process of quantum circuits, with reinforcement learning (RL) emerging as a promising approach. Yet, RL-based QAS methods encounter significant scalability issues, as computational and training costs grow rapidly with the number of qubits, circuit depth, and hardware noise. To address these challenges, we introduce $\textit{TensorRL-QAS}$, an improved framework that combines tensor network methods with RL for QAS. By warm-starting the QAS with a matrix product state approximation of the target solution, TensorRL-QAS effectively narrows the search space to physically meaningful circuits and accelerates the convergence to the desired solution. Tested on several quantum chemistry problems of up to 12-qubit, TensorRL-QAS achieves up to a 10-fold reduction in CNOT count and circuit depth compared to baseline methods, while maintaining or surpassing chemical accuracy. It reduces classical optimizer function evaluation by up to 100-fold, accelerates training episodes by up to 98$\%$, and can achieve 50$\%$ success probability for 10-qubit systems, far exceeding the $<$1$\%$ rates of baseline. Robustness and versatility are demonstrated both in the noiseless and noisy scenarios, where we report a simulation of an 8-qubit system. Furthermore, TensorRL-QAS demonstrates effectiveness on systems on 20-qubit quantum systems, positioning it as a state-of-the-art quantum circuit discovery framework for near-term hardware and beyond.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2505.09371

Country: Europe (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)