Optimization
FACTORS: Factorial Approximation for Complementary Two-factor Optimization with Risk-aware Scoring
Kim, Dongseok, Jeong, Wonjun, Oh, Gisung
We propose FACTORS, a framework that combines design of experiments with Shapley decomposition to address performance and stability issues that are sensitive to combinations of training factors. Our approach consistently estimates main effects and two-factor interactions, then integrates them into a risk-adjusted objective function that jointly accounts for uncertainty and cost, enabling reliable selection of configurations under a fixed budget. Effect estimation is implemented through two complementary paths: a plug-in path based on conditional means, and a least-squares path that reconstructs Shapley contributions from samples. These paths are designed to work complementarily even when design density and bias levels differ. By incorporating standardization of estimates, bias correction, and uncertainty quantification, our procedure ensures comparability across heterogeneous factor spaces and designs, while a lightweight search routine yields configurations within practical time even for large factor spaces. On the theoretical side, we provide error decompositions, sample complexity analysis, and upper bounds on optimality gaps. On the interpretive side, we summarize main effects and interactions in map form, highlighting adjustment priorities and safe improvement pathways. Across diverse datasets and design conditions, our approach improves rank preservation and optimal configuration identification, reduces decision-making risks, and offers a tuning foundation that delivers interpretable justification alongside stable performance gains even under budget constraints.
Learning Concave Bid Shading Strategies in Online Auctions via Measure-valued Proximal Optimization
Nodozi, Iman, Gligorijevic, Djordje, Halder, Abhishek
This work proposes a bid shading strategy for first-price auctions as a measure-valued optimization problem. We consider a standard parametric form for bid shading and formulate the problem as convex optimization over the joint distribution of shading parameters. After each auction, the shading parameter distribution is adapted via a regularized Wasserstein-proximal update with a data-driven energy functional. This energy functional is conditional on the context, i.e., on publisher/user attributes such as domain, ad slot type, device, or location. The proposed algorithm encourages the bid distribution to place more weight on values with higher expected surplus, i.e., where the win probability and the value gap are both large. We show that the resulting measure-valued convex optimization problem admits a closed form solution. A numerical example illustrates the proposed method.
Control Barrier Functions via Minkowski Operations for Safe Navigation among Polytopic Sets
Chen, Yi-Hsuan, Liu, Shuo, Xiao, Wei, Belta, Calin, Otte, Michael
Safely navigating around obstacles while respecting the dynamics, control, and geometry of the underlying system is a key challenge in robotics. Control Barrier Functions (CBFs) generate safe control policies by considering system dynamics and geometry when calculating safe forward-invariant sets. Existing CBF-based methods often rely on conservative shape approximations, like spheres or ellipsoids, which have explicit and differentiable distance functions. In this paper, we propose an optimization-defined CBF that directly considers the exact Signed Distance Function (SDF) between a polytopic robot and polytopic obstacles. Inspired by the Gilbert-Johnson-Keerthi (GJK) algorithm, we formulate both (i) minimum distance and (ii) penetration depth between polytopic sets as convex optimization problems in the space of Minkowski difference operations (the MD-space). Convenient geometric properties of the MD-space enable the derivatives of implicit SDF between two polytopes to be computed via differentiable optimization. We demonstrate the proposed framework in three scenarios including pure translation, initialization inside an unsafe set, and multi-obstacle avoidance. These three scenarios highlight the generation of a non-conservative maneuver, a recovery after starting in collision, and the consideration of multiple obstacles via pairwise CBF constraint, respectively.
Imitation Learning as Return Distribution Matching
Lazzati, Filippo, Metelli, Alberto Maria
We study the problem of training a risk-sensitive reinforcement learning (RL) agent through imitation learning (IL). Unlike standard IL, our goal is not only to train an agent that matches the expert's expected return (i.e., its average performance) but also its risk attitude (i.e., other features of the return distribution, such as variance). We propose a general formulation of the risk-sensitive IL problem in which the objective is to match the expert's return distribution in Wasserstein distance. We focus on the tabular setting and assume the expert's reward is known. After demonstrating the limited expressivity of Markovian policies for this task, we introduce an efficient and sufficiently expressive subclass of non-Markovian policies tailored to it. Building on this subclass, we develop two provably efficient algorithms, RS-BC and RS-KT, for solving the problem when the transition model is unknown and known, respectively. We show that RS-KT achieves substantially lower sample complexity than RS-BC by exploiting dynamics information. We further demonstrate the sample efficiency of return distribution matching in the setting where the expert's reward is unknown by designing an oracle-based variant of RS-KT. Finally, we complement our theoretical analysis of RS-KT and RS-BC with numerical simulations, highlighting both their sample efficiency and the advantages of non-Markovian policies over standard sample-efficient IL algorithms.
AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
Lee, Sangjun, Woo, Seung-taek, Jin, Jungyu, Lee, Changhun, Park, Eunhyeok
To enable broader deployment of Large Language Models (LLMs), it is essential to identify the best-performing model under strict memory constraints. We present AMQ, Automated Mixed-Precision Weight-Only Quantization, a framework that assigns layer-wise quantization bit-widths to optimally balance model quality and memory usage. However, the combinatorial search space, with over 10^{100} possible configurations, makes conventional black-box optimization infeasible. AMQ overcomes this challenge through four key innovations:(1) search space pruning using prior knowledge to exclude unpromising configurations, (2) quantization proxy to bypass costly format conversions during search, (3) quality predictor to minimize evaluation overhead, and (4) iterative search-and-update strategy for fast and stable convergence. By integrating these components, AMQ efficiently explores the quality-efficiency landscape, reaching the Pareto frontier and yielding LLMs that are both compact and high-performing. Our code is available at https://github.com/dlwns147/amq.
A learning-driven automatic planning framework for proton PBS treatments of H&N cancers
Wang, Qingqing, Xiao, Liqiang, Chang, Chang
Proton pencil beam scanning (PBS) treatment planning for head & neck (H&N) cancers involves numerous conflicting objectives, requiring iterative objective parameter adjustments to balance multiple clinical goals. We propose a learning-driven inverse optimizer and integrate it into a proximal policy optimization (PPO)-based planning framework to automatically generate high-quality plans for patients with diverse treatment requirements. The inverse optimizer is a learning-to-optimize (L2O) method that predicts update steps by learning from task-specific data distributions. For the first time, long-context processing techniques developed for large language models (LLMs) are utilized to address the scalability limitations of existing L2O methods, enabling simultaneous optimization over a substantially large set of variables. The PPO framework functions as an outer-loop virtual planner, autonomously adjusting objective parameters through a policy network, and the inner-loop L2O inverse optimizer computes machine-deliverable spot monitor unit (MU) values based on the PPO-refined objectives. Moreover, a Swin UnetR dose predictor is trained with prescription- and beam-specific information to estimate the initial objective parameters. In our experiments, total 97 patients with bilateral or ipsilateral H&N cancers are collected for training and testing. Compared with the second-order gradient-based methods, our L2O optimizer improves the effectiveness and efficiency of the time-consuming inverse optimization by 22.97% and 36.41%, respectively, and in conjunction with the PPO-based virtual planner, plans are generated within clinically acceptable times, i.e. 2.55 hours in average, and shows improved or comparable organs-at-risk sparing with superior target coverage compared with human-generated plans.
Critical Nodes Identification in Complex Networks: A Survey
Chen, Duxin, Chen, Jiawen, Zhang, Xiaoyu, Jia, Qinghan, Liu, Xiaolu, Sun, Ye, Lv, Linyuan, Yu, Wenwu
Complex networks have become essential tools for understanding diverse phenomena in social systems, traffic systems, biomolecular systems, and financial systems. Identifying critical nodes is a central theme in contemporary research, serving as a vital bridge between theoretical foundations and practical applications. Nevertheless, the intrinsic complexity and structural heterogeneity characterizing real-world networks, with particular emphasis on dynamic and higher-order networks, present substantial obstacles to the development of universal frameworks for critical node identification. This paper provides a comprehensive review of critical node identification techniques, categorizing them into seven main classes: centrality, critical nodes deletion problem, influence maximization, network control, artificial intelligence, higher-order and dynamic methods. Our review bridges the gaps in existing surveys by systematically classifying methods based on their methodological foundations and practical implications, and by highlighting their strengths, limitations, and applicability across different network types. Our work enhances the understanding of critical node research by identifying key challenges, such as algorithmic universality, real-time evaluation in dynamic networks, analysis of higher-order structures, and computational efficiency in large-scale networks. The structured synthesis consolidates current progress and highlights open questions, particularly in modeling temporal dynamics, advancing efficient algorithms, integrating machine learning approaches, and developing scalable and interpretable metrics for complex systems.
Inference-stage Adaptation-projection Strategy Adapts Diffusion Policy to Cross-manipulators Scenarios
Yao, Xiangtong, Zhou, Yirui, Meng, Yuan, Liu, Yanwen, Dong, Liangyu, Zhang, Zitao, Bing, Zhenshan, Huang, Kai, Sun, Fuchun, Knoll, Alois
Diffusion policies are powerful visuomotor models for robotic manipulation, yet they often fail to generalize to manipulators or end-effectors unseen during training and struggle to accommodate new task requirements at inference time. Addressing this typically requires costly data recollection and policy retraining for each new hardware or task configuration. To overcome this, we introduce an adaptation-projection strategy that enables a diffusion policy to perform zero-shot adaptation to novel manipulators and dynamic task settings, entirely at inference time and without any retraining. Our method first trains a diffusion policy in SE(3) space using demonstrations from a base manipulator. During online deployment, it projects the policy's generated trajectories to satisfy the kinematic and task-specific constraints imposed by the new hardware and objectives. Moreover, this projection dynamically adapts to physical differences (e.g., tool-center-point offsets, jaw widths) and task requirements (e.g., obstacle heights), ensuring robust and successful execution. We validate our approach on real-world pick-and-place, pushing, and pouring tasks across multiple manipulators, including the Franka Panda and Kuka iiwa 14, equipped with a diverse array of end-effectors like flexible grippers, Robotiq 2F/3F grippers, and various 3D-printed designs. Our results demonstrate consistently high success rates in these cross-manipulator scenarios, proving the effectiveness and practicality of our adaptation-projection strategy. The code will be released after peer review.
Compressed Sensing: Mathematical Foundations, Implementation, and Advanced Optimization Techniques
Stevenson, Shane, Sabagh, Maryam
Compressed sensing is a signal processing technique that allows for the reconstruction of a signal from a small set of measurements. The key idea behind compressed sensing is that many real-world signals are inherently sparse, meaning that they can be efficiently represented in a different space with only a few components compared to their original space representation. In this paper we will explore the mathematical formulation behind compressed sensing, its logic and pathologies, and apply compressed sensing to real world signals.
Convergence Rate in Nonlinear Two-Time-Scale Stochastic Approximation with State (Time)-Dependence
Chen, Zixi, Xu, Yumin, Zhang, Ruixun
The nonlinear two-time-scale stochastic approximation is widely studied under conditions of bounded variances in noise. Motivated by recent advances that allow for variability linked to the current state or time, we consider state- and time-dependent noises. We show that the Lyapunov function exhibits polynomial convergence rates in both cases, with the rate of polynomial delay depending on the parameters of state- or time-dependent noises. Notably, if the state noise parameters fully approach their limiting value, the Lyapunov function achieves an exponential convergence rate. We provide two numerical examples to illustrate our theoretical findings in the context of stochastic gradient descent with Polyak-Ruppert averaging and stochastic bilevel optimization.