Optimization
Entropic Risk Optimization in Discounted MDPs: Sample Complexity Bounds with a Generative Model
Mortensen, Oliver, Talebi, Mohammad Sadegh
In this paper we analyze the sample complexities of learning the optimal state-action value function $Q^*$ and an optimal policy $ฯ^*$ in a discounted Markov decision process (MDP) where the agent has recursive entropic risk-preferences with risk-parameter $ฮฒ\neq 0$ and where a generative model of the MDP is available. We provide and analyze a simple model based approach which we call model-based risk-sensitive $Q$-value-iteration (MB-RS-QVI) which leads to $(ฮต,ฮด)$-PAC-bounds on $\|Q^*-Q^k\|$, and $\|V^*-V^{ฯ_k}\|$ where $Q_k$ is the output of MB-RS-QVI after k iterations and $ฯ_k$ is the greedy policy with respect to $Q_k$. Both PAC-bounds have exponential dependence on the effective horizon $\frac{1}{1-ฮณ}$ and the strength of this dependence grows with the learners risk-sensitivity $|ฮฒ|$. We also provide two lower bounds which shows that exponential dependence on $|ฮฒ|\frac{1}{1-ฮณ}$ is unavoidable in both cases. The lower bounds reveal that the PAC-bounds are both tight in $\varepsilon$ and $ฮด$ and that the PAC-bound on $Q$-learning is tight in the number of actions $A$, and that the PAC-bound on policy-learning is nearly tight in $A$.
Constrained Bayesian Optimization under Bivariate Gaussian Process with Application to Cure Process Optimization
Li, Yezhuo, Zhang, Qiong, Limaye, Madhura, Li, Gang
Bayesian Optimization, leveraging Gaussian process models, has proven to be a powerful tool for minimizing expensive-to-evaluate objective functions by efficiently exploring the search space. Extensions such as constrained Bayesian Optimization have further enhanced Bayesian Optimization's utility in practical scenarios by focusing the search within feasible regions defined by a black-box constraint function. However, constrained Bayesian Optimization in is developed based on the independence Gaussian processes assumption between objective and constraint functions, which may not hold in real-world applications. To address this issue, we use the bivariate Gaussian process model to characterize the dependence between the objective and constraint functions and developed the constrained expected improvement acquisition function under this model assumption. We show case the performance of the proposed approach with an application to cure process optimization in Manufacturing.
A condensing approach to multiple shooting neural ordinary differential equation
Prabhu, Siddharth, Rangarajan, Srinivas, Kothare, Mayuresh
Multiple-shooting is a parameter estimation approach for ordinary differential equations. In this approach, the trajectory is broken into small intervals, each of which can be integrated independently. Equality constraints are then applied to eliminate the shooting gap between the end of the previous trajectory and the start of the next trajectory. Unlike single-shooting, multiple-shooting is more stable, especially for highly oscillatory and long trajectories. In the context of neural ordinary differential equations, multiple-shooting is not widely used due to the challenge of incorporating general equality constraints. In this work, we propose a condensing-based approach to incorporate these shooting equality constraints while training a multiple-shooting neural ordinary differential equation (MS-NODE) using first-order optimization methods such as Adam.
Privacy Amplification in Differentially Private Zeroth-Order Optimization with Hidden States
Chien, Eli, Chen, Wei-Ning, Li, Pan
Zeroth-order optimization has emerged as a promising approach for fine-tuning large language models on domain-specific data, particularly under differential privacy (DP) and memory constraints. While first-order methods have been extensively studied from a privacy perspective, the privacy analysis and algorithmic design for zeroth-order methods remain significantly underexplored. A critical open question concerns hidden-state DP analysis: although convergent privacy bounds are known for first-order methods, it has remained unclear whether similar guarantees can be established for zeroth-order methods. In this work, we provide an affirmative answer by proving a convergent DP bound for zeroth-order optimization. Our analysis generalizes the celebrated privacy amplification-by-iteration framework to the setting of smooth loss functions in zeroth-order optimization. Furthermore, it induces better DP zeroth-order algorithmic designs that are previously unknown to the literature.
Emerging ML-AI Techniques for Analog and RF EDA
Wu, Zhengfeng, Chen, Ziyi, Achebe, Nnaemeka, Rao, Vaibhav V., Shrestha, Pratik, Savidis, Ioannis
This survey explores the integration of machine learning (ML) into EDA workflows for analog and RF circuits, addressing challenges unique to analog design, which include complex constraints, nonlinear design spaces, and high computational costs. State-of-the-art learning and optimization techniques are reviewed for circuit tasks such as constraint formulation, topology generation, device modeling, sizing, placement, and routing. The survey highlights the capability of ML to enhance automation, improve design quality, and reduce time-to-market while meeting the target specifications of an analog or RF circuit. Emerging trends and cross-cutting challenges, including robustness to variations and considerations of interconnect parasitics, are also discussed.
Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time
Chehade, Mohamad, Ghosal, Soumya Suvra, Chakraborty, Souradip, Reddy, Avinash, Manocha, Dinesh, Zhu, Hao, Bedi, Amrit Singh
Aligning large language models with humans is challenging due to the inherently multifaceted nature of preference feedback. While existing approaches typically frame this as a multi-objective optimization problem, they often overlook how humans actually make decisions. Research on bounded rationality suggests that human decision making follows satisficing strategies-optimizing primary objectives while ensuring others meet acceptable thresholds. To bridge this gap and operationalize the notion of satisficing alignment, we propose SITAlign: an inference time framework that addresses the multifaceted nature of alignment by maximizing a primary objective while satisfying threshold-based constraints on secondary criteria. We provide theoretical insights by deriving sub-optimality bounds of our satisficing based inference alignment approach. We empirically validate SITAlign's performance through extensive experimentation on multiple benchmarks. For instance, on the PKU-SafeRLHF dataset with the primary objective of maximizing helpfulness while ensuring a threshold on harmlessness, SITAlign outperforms the state-of-the-art multi objective decoding strategy by a margin of 22.3% in terms of GPT-4 win-tie rate for helpfulness reward while adhering to the threshold on harmlessness.
An Optimistic Algorithm for online CMDPS with Anytime Adversarial Constraints
Zhu, Jiahui, Yu, Kihyun, Lee, Dabeen, Liu, Xin, Wei, Honghao
Online safe reinforcement learning (RL) plays a key role in dynamic environments, with applications in autonomous driving, robotics, and cybersecurity. The objective is to learn optimal policies that maximize rewards while satisfying safety constraints modeled by constrained Markov decision processes (CMDPs). Existing methods achieve sublinear regret under stochastic constraints but often fail in adversarial settings, where constraints are unknown, time-varying, and potentially adversarially designed. In this paper, we propose the Optimistic Mirror Descent Primal-Dual (OMDPD) algorithm, the first to address online CMDPs with anytime adversarial constraints. OMDPD achieves optimal regret O(sqrt(K)) and strong constraint violation O(sqrt(K)) without relying on Slater's condition or the existence of a strictly known safe policy. We further show that access to accurate estimates of rewards and transitions can further improve these bounds. Our results offer practical guarantees for safe decision-making in adversarial environments.
Learning-Augmented Algorithms for Boolean Satisfiability
Attias, Idan, Gao, Xing, Reyzin, Lev
Learning-augmented algorithms are a prominent recent development in beyond worst-case analysis. In this framework, a problem instance is provided with a prediction (``advice'') from a machine-learning oracle, which provides partial information about an optimal solution, and the goal is to design algorithms that leverage this advice to improve worst-case performance. We study the classic Boolean satisfiability (SAT) decision and optimization problems within this framework using two forms of advice. ``Subset advice" provides a random $ฮต$ fraction of the variables from an optimal assignment, whereas ``label advice" provides noisy predictions for all variables in an optimal assignment. For the decision problem $k$-SAT, by using the subset advice we accelerate the exponential running time of the PPSZ family of algorithms due to Paturi, Pudlak, Saks and Zane, which currently represent the state of the art in the worst case. We accelerate the running time by a multiplicative factor of $2^{-c}$ in the base of the exponent, where $c$ is a function of $ฮต$ and $k$. For the optimization problem, we show how to incorporate subset advice in a black-box fashion with any $ฮฑ$-approximation algorithm, improving the approximation ratio to $ฮฑ+ (1 - ฮฑ)ฮต$. Specifically, we achieve approximations of $0.94 + ฮฉ(ฮต)$ for MAX-$2$-SAT, $7/8 + ฮฉ(ฮต)$ for MAX-$3$-SAT, and $0.79 + ฮฉ(ฮต)$ for MAX-SAT. Moreover, for label advice, we obtain near-optimal approximation for instances with large average degree, thereby generalizing recent results on MAX-CUT and MAX-$2$-LIN.
Graph-Based Spectral Decomposition for Parameter Coordination in Language Model Fine-Tuning
Zhang, Hanlu, Ma, Yumeng, Wang, Shuo, Liu, Guiran, Zhu, Binrong
This paper proposes a parameter collaborative optimization algorithm for large language models, enhanced with graph spectral analysis. The goal is to improve both fine-tuning efficiency and structural awareness during training. In the proposed method, the parameters of a pre-trained language model are treated as nodes in a graph. A weighted graph is constructed, and Laplacian spectral decomposition is applied to enable frequency-domain modeling and structural representation of the parameter space. Based on this structure, a joint loss function is designed. It combines the task loss with a spectral regularization term to facilitate collaborative updates among parameters. In addition, a spectral filtering mechanism is introduced during the optimization phase. This mechanism adjusts gradients in a structure-aware manner, enhancing the model's training stability and convergence behavior. The method is evaluated on multiple tasks, including traditional fine-tuning comparisons, few-shot generalization tests, and convergence speed analysis. In all settings, the proposed approach demonstrates superior performance. The experimental results confirm that the spectral collaborative optimization framework effectively reduces parameter perturbations and improves fine-tuning quality while preserving overall model performance. This work contributes significantly to the field of artificial intelligence by advancing parameter-efficient training methodologies for large-scale models, reinforcing the importance of structural signal processing in deep learning optimization, and offering a robust, generalizable framework for enhancing language model adaptability and performance.
Frugal Machine Learning for Energy-efficient, and Resource-aware Artificial Intelligence
Violos, John, Diamanti, Konstantina-Christina, Kompatsiaris, Ioannis, Papadopoulos, Symeon
Frugal Machine Learning (FML) refers to the practice of designing Machine Learning (ML) models that are efficient, cost-effective, and mindful of resource constraints. This field aims to achieve acceptable performance while minimizing the use of computational resources, time, energy, and data for both training and inference. FML strategies can be broadly categorized into input frugality, learning process frugality, and model frugality, each focusing on reducing resource consumption at different stages of the ML pipeline. This chapter explores recent advancements, applications, and open challenges in FML, emphasizing its importance for smart environments that incorporate edge computing and IoT devices, which often face strict limitations in bandwidth, energy, or latency. Technological enablers such as model compression, energy-efficient hardware, and data-efficient learning techniques are discussed, along with adaptive methods including parameter regularization, knowledge distillation, and dynamic architecture design that enable incremental model updates without full retraining. Furthermore, it provides a comprehensive taxonomy of frugal methods, discusses case studies across diverse domains, and identifies future research directions to drive innovation in this evolving field.