Optimization
Robust Trajectory Generation and Control for Quadrotor Motion Planning with Field-of-View Control Barrier Certification
Pan, Lishuo, Catellani, Mattia, Sabattini, Lorenzo, Ayanian, Nora
Many approaches to multi-robot coordination are susceptible to failure due to communication loss and uncertainty in estimation. We present a real-time communication-free distributed algorithm for navigating robots to their desired goals certified by control barrier functions, that model and control the onboard sensing behavior to keep neighbors in the limited field of view for position estimation. The approach is robust to temporary tracking loss and directly synthesizes control in real time to stabilize visual contact through control Lyapunov-barrier functions. The main contributions of this paper are a continuous-time robust trajectory generation and control method certified by control barrier functions for distributed multi-robot systems and a discrete optimization procedure, namely, MPC-CBF, to approximate the certified controller. In addition, we propose a linear surrogate of high-order control barrier function constraints and use sequential quadratic programming to solve MPC-CBF efficiently. We demonstrate results in simulation with 10 robots and physical experiments with 2 custom-built UAVs. To the best of our knowledge, this work is the first of its kind to generate a robust continuous-time trajectory and controller concurrently, certified by control barrier functions utilizing piecewise splines.
eagle: early approximated gradient based learning rate estimator
Fujimoto, Takumi, Nishi, Hiroaki
We propose EAGLE update rule, a novel optimization method that accelerates loss convergence during the early stages of training by leveraging both current and previous step parameter and gradient values. The update algorithm estimates optimal parameters by computing the changes in parameters and gradients between consecutive training steps and leveraging the local curvature of the loss landscape derived from these changes. However, this update rule has potential instability, and to address that, we introduce an adaptive switching mechanism that dynamically selects between Adam and EAGLE update rules to enhance training stability. Experiments on standard benchmark datasets demonstrate that EAGLE optimizer, which combines this novel update rule with the switching mechanism achieves rapid training loss convergence with fewer epochs, compared to conventional optimization methods.
Refining Adaptive Zeroth-Order Optimization at Ease
Shu, Yao, Zhang, Qixin, He, Kun, Dai, Zhongxiang
Recently, zeroth-order (ZO) optimization plays an essential role in scenarios where gradient information is inaccessible or unaffordable, such as black-box systems and resource-constrained environments. While existing adaptive methods such as ZO-AdaMM have shown promise, they are fundamentally limited by their underutilization of moment information during optimization, usually resulting in underperforming convergence. To overcome these limitations, this paper introduces Refined Adaptive Zeroth-Order Optimization (R-AdaZO). Specifically, we first show the untapped variance reduction effect of first moment estimate on ZO gradient estimation, which improves the accuracy and stability of ZO updates. We then refine the second moment estimate based on these variance-reduced gradient estimates to better capture the geometry of the optimization landscape, enabling a more effective scaling of ZO updates. We present rigorous theoretical analysis to show (I) the first analysis to the variance reduction of first moment estimate in ZO optimization, (II) the improved second moment estimates with a more accurate approximation of its variance-free ideal, (III) the first variance-aware convergence framework for adaptive ZO methods, which may be of independent interest, and (IV) the faster convergence of R-AdaZO than existing baselines like ZO-AdaMM. Our extensive experiments, including synthetic problems, black-box adversarial attack, and memory-efficient fine-tuning of large language models (LLMs), further verify the superior convergence of R-AdaZO, indicating that R-AdaZO offers an improved solution for real-world ZO optimization challenges.
Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization
Sieradzki, Szymon, Maลdziuk, Jacek
In this paper we review hyperparameter optimization methods for machine learning models, with a particular focus on the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm. We propose several modifications to ATPE and assess their efficacy on a diverse set of standard benchmark functions. Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization on selected benchmarks, a finding that holds practical relevance for their application in real-world machine learning / optimization tasks. In machine learning, the performance of a model heavily depends on the correct choice of hyperparameters, such as the learning rate, the number of layers in a neural network, or specific regularization techniques. These hyperparameters form a multidimensional space where some dimensions are continuous (e.g., the learning rate), while others are discrete (e.g., the number of network layers). The task of Hyperparameter Optimization (HPO) aims to find the best combination of these hyperparameters by searching this space in a way that optimizes a predefined objective function. In supervised learning, this function is usually a loss function, which quantifies an error between the predictions of the model and the true values. HPO is applicable across a wide range of machine learning models, as most optimization techniques are agnostic to the underlying model type. The core requirement for any HPO algorithm is to define the hyperparameter space and the objective function. However, HPO presents specific challenges that separate it from other optimization problems, making it a unique area in the field. Each evaluation of the objective function requires training the considered machine learning model from scratch, which is often the most time-consuming part of the optimization process. As a result, when designing HPO algorithms, the focus is less on the internal computational efficiency of the optimizer but rather on minimizing the number of objective function evaluations, while still maintaining good predictive performance.
High-Dimensional Bayesian Optimization Using Both Random and Supervised Embeddings
Priem, Rรฉmy, Diouane, Youssef, Bartoli, Nathalie, Dubreuil, Sylvain, Saves, Paul
Bayesian optimization (BO) is one of the most powerful strategies to solve computationally expensive-to-evaluate blackbox optimization problems. However, BO methods are conventionally used for optimization problems of small dimension because of the curse of dimensionality. In this paper, a high-dimensionnal optimization method incorporating linear embedding subspaces of small dimension is proposed to efficiently perform the optimization. An adaptive learning strategy for these linear embeddings is carried out in conjunction with the optimization. The resulting BO method, named efficient global optimization coupled with random and supervised embedding (EGORSE), combines in an adaptive way both random and supervised linear embeddings. EGORSE has been compared to state-of-the-art algorithms and tested on academic examples with a number of design variables ranging from 10 to 600. The obtained results show the high potential of EGORSE to solve high-dimensional blackbox optimization problems, in terms of both CPU time and the limited number of calls to the expensive blackbox simulation.
DiffIM: Differentiable Influence Minimization with Surrogate Modeling and Continuous Relaxation
Lee, Junghun, Kim, Hyunju, Bu, Fanchen, Ko, Jihoon, Shin, Kijung
In social networks, people influence each other through social links, which can be represented as propagation among nodes in graphs. Influence minimization (IMIN) is the problem of manipulating the structures of an input graph (e.g., removing edges) to reduce the propagation among nodes. IMIN can represent time-critical real-world applications, such as rumor blocking, but IMIN is theoretically difficult and computationally expensive. Moreover, the discrete nature of IMIN hinders the usage of powerful machine learning techniques, which requires differentiable computation. In this work, we propose DiffIM, a novel method for IMIN with two differentiable schemes for acceleration: (1) surrogate modeling for efficient influence estimation, which avoids time-consuming simulations (e.g., Monte Carlo), and (2) the continuous relaxation of decisions, which avoids the evaluation of individual discrete decisions (e.g., removing an edge). We further propose a third accelerating scheme, gradient-driven selection, that chooses edges instantly based on gradients without optimization (spec., gradient descent iterations) on each test instance. Through extensive experiments on real-world graphs, we show that each proposed scheme significantly improves speed with little (or even no) IMIN performance degradation. Our method is Pareto-optimal (i.e., no baseline is faster and more effective than it) and typically several orders of magnitude (spec., up to 15,160X) faster than the most effective baseline while being more effective.
Mirror Descent Under Generalized Smoothness
Yu, Dingzhi, Jiang, Wei, Wan, Yuanyu, Zhang, Lijun
Smoothness is crucial for attaining fast rates in first-order optimization. However, many optimization problems in modern machine learning involve non-smooth objectives. Recent studies relax the smoothness assumption by allowing the Lipschitz constant of the gradient to grow with respect to the gradient norm, which accommodates a broad range of objectives in practice. Despite this progress, existing generalizations of smoothness are restricted to Euclidean geometry with $\ell_2$-norm and only have theoretical guarantees for optimization in the Euclidean space. In this paper, we address this limitation by introducing a new $\ell*$-smoothness concept that measures the norm of Hessian in terms of a general norm and its dual, and establish convergence for mirror-descent-type algorithms, matching the rates under the classic smoothness. Notably, we propose a generalized self-bounding property that facilitates bounding the gradients via controlling suboptimality gaps, serving as a principal component for convergence analysis. Beyond deterministic optimization, we establish an anytime convergence for stochastic mirror descent based on a new bounded noise condition that encompasses the widely adopted bounded or affine noise assumptions.
Dual Alignment Maximin Optimization for Offline Model-based RL
Zhou, Chi, Luo, Wang, Li, Haoran, Han, Congying, Guo, Tiande, Zhang, Zicheng
Offline reinforcement learning agents face significant deployment challenges due to the synthetic-to-real distribution mismatch. While most prior research has focused on improving the fidelity of synthetic sampling and incorporating off-policy mechanisms, the directly integrated paradigm often fails to ensure consistent policy behavior in biased models and underlying environmental dynamics, which inherently arise from discrepancies between behavior and learning policies. In this paper, we first shift the focus from model reliability to policy discrepancies while optimizing for expected returns, and then self-consistently incorporate synthetic data, deriving a novel actor-critic paradigm, Dual Alignment Maximin Optimization (DAMO). It is a unified framework to ensure both model-environment policy consistency and synthetic and offline data compatibility. The inner minimization performs dual conservative value estimation, aligning policies and trajectories to avoid out-of-distribution states and actions, while the outer maximization ensures that policy improvements remain consistent with inner value estimates. Empirical evaluations demonstrate that DAMO effectively ensures model and policy alignments, achieving competitive performance across diverse benchmark tasks.
Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies
Cao, Yuefan, Li, Xiaoyu, Liang, Yingyu, Sha, Zhizhou, Shi, Zhenmei, Song, Zhao, Zhang, Jiahao
As AI research surges in both impact and volume, conferences have imposed submission limits to maintain paper quality and alleviate organizational pressure. In this work, we examine the fairness of desk-rejection systems under submission limits and reveal that existing practices can result in substantial inequities. Specifically, we formally define the paper submission limit problem and identify a critical dilemma: when the number of authors exceeds three, it becomes impossible to reject papers solely based on excessive submissions without negatively impacting innocent authors. Thus, this issue may unfairly affect early-career researchers, as their submissions may be penalized due to co-authors with significantly higher submission counts, while senior researchers with numerous papers face minimal consequences. To address this, we propose an optimization-based fairness-aware desk-rejection mechanism and formally define two fairness metrics: individual fairness and group fairness. We prove that optimizing individual fairness is NP-hard, whereas group fairness can be efficiently optimized via linear programming. Through case studies, we demonstrate that our proposed system ensures greater equity than existing methods, including those used in CVPR 2025, offering a more socially just approach to managing excessive submissions in AI conferences.
An MDP Model for Censoring in Harvesting Sensors: Optimal and Approximated Solutions
Fernandez-Bes, Jesus, Cid-Sueiro, Jesus, Marques, Antonio G.
In this paper, we propose a novel censoring policy for energy-efficient transmissions in energy-harvesting sensors. The problem is formulated as an infinite-horizon Markov Decision Process (MDP). The objective to be optimized is the expected sum of the importance (utility) of all transmitted messages. Assuming that such importance can be evaluated at the transmitting node, we show that, under certain conditions on the battery model, the optimal censoring policy is a threshold function on the importance value. Specifically, messages are transmitted only if their importance is above a threshold whose value depends on the battery level. Exploiting this property, we propose a model-based stochastic scheme that approximates the optimal solution, with less computational complexity and faster convergence speed than a conventional Q-learning algorithm. Numerical experiments in single-hop and multi-hop networks confirm the analytical advantages of the proposed scheme.