Optimization
TOCALib: Optimal control library with interpolation for bimanual manipulation and obstacles avoidance
Danik, Yulia, Makarov, Dmitry, Arkhipova, Aleksandra, Davidenko, Sergei, Panov, Aleksandr
TOCALib: Optimal control library with interpolation for bimanual manipulation and obstacles avoidance Y ulia Danik 1, Dmitry Makarov 2, Aleksandra Arkhipova 3, Sergei Davidenko 4 and Aleksandr Panov 5 Abstract -- The paper presents a new approach for constructing a library of optimal trajectories for two robotic manipulators, Two-Arm Optimal Control and A voidance Library (TOCALib) 1 . The optimisation takes into account kinodynamic and other constraints within the FROST framework. The novelty of the method lies in the consideration of collisions using the DCOL method, which allows obtaining symbolic expressions for assessing the presence of collisions and using them in gradient-based optimization control methods. The proposed approach allowed the implementation of complex bimanual manipulations. In this paper we used Mobile Aloha as an example of TOCALib application. The approach can be extended to other bimanual robots, as well as to gait control of bipedal robots. It can also be used to construct training data for machine learning tasks for manipulation.
Stochastic Smoothed Primal-Dual Algorithms for Nonconvex Optimization with Linear Inequality Constraints
Huang, Ruichuan, Zhang, Jiawei, Alacaoglu, Ahmet
We propose smoothed primal-dual algorithms for solving stochastic and smooth nonconvex optimization problems with linear inequality constraints. Our algorithms are single-loop and only require a single stochastic gradient based on one sample at each iteration. A distinguishing feature of our algorithm is that it is based on an inexact gradient descent framework for the Moreau envelope, where the gradient of the Moreau envelope is estimated using one step of a stochastic primal-dual augmented Lagrangian method. To handle inequality constraints and stochasticity, we combine the recently established global error bounds in constrained optimization with a Moreau envelope-based analysis of stochastic proximal algorithms. For obtaining $\varepsilon$-stationary points, we establish the optimal $O(\varepsilon^{-4})$ sample complexity guarantee for our algorithms and provide extensions to stochastic linear constraints. We also show how to improve this complexity to $O(\varepsilon^{-3})$ by using variance reduction and the expected smoothness assumption. Unlike existing methods, the iterations of our algorithms are free of subproblems, large batch sizes or increasing penalty parameters and use dual variable updates to ensure feasibility.
Bottleneck Identification in Resource-Constrained Project Scheduling via Constraint Relaxation
Nedbรกlek, Lukรกลก, Novรกk, Antonรญn
Keywords: scheduling, RCPSP, bottlenecks, constraint relaxation Abstract: In realistic production scenarios, Advanced Planning and Scheduling (APS) tools often require manual intervention by production planners, as the system works with incomplete information, resulting in suboptimal schedules. Often, the preferable solution is not found just because of the too-restrictive constraints specifying the optimization problem, representing bottlenecks in the schedule. To provide computer-assisted support for decision-making, we aim to automatically identify bottlenecks in the given schedule while linking them to the particular constraints to be relaxed. In this work, we address the problem of reducing the tardiness of a particular project in an obtained schedule in the resource-constrained project scheduling problem by relaxing constraints related to identified bottlenecks. We develop two methods for this purpose. The second method identifies potential improvements in relaxed versions of the problem and proposes targeted relaxations. Surprisingly, the untargeted relaxations result in improvements comparable to the targeted relaxations. 1 INTRODUCTION In the modern manufacturing industry, Advanced Planning and Scheduling (APS) tools are used to schedule production automatically. However, not all parameters and information are available to the APS systems in practice.
PROPEL: Supervised and Reinforcement Learning for Large-Scale Supply Chain Planning
Akhlaghi, Vahid Eghbal, Zandehshahvar, Reza, Van Hentenryck, Pascal
This paper considers how to fuse Machine Learning (ML) and optimization to solve large-scale Supply Chain Planning (SCP) optimization problems. These problems can be formulated as MIP models which feature both integer (non-binary) and continuous variables, as well as flow balance and capacity constraints. This raises fundamental challenges for existing integrations of ML and optimization that have focused on binary MIPs and graph problems. To address these, the paper proposes PROPEL, a new framework that combines optimization with both supervised and Deep Reinforcement Learning (DRL) to reduce the size of search space significantly. PROPEL uses supervised learning, not to predict the values of all integer variables, but to identify the variables that are fixed to zero in the optimal solution, leveraging the structure of SCP applications. PROPEL includes a DRL component that selects which fixed-at-zero variables must be relaxed to improve solution quality when the supervised learning step does not produce a solution with the desired optimality tolerance. PROPEL has been applied to industrial supply chain planning optimizations with millions of variables. The computational results show dramatic improvements in solution times and quality, including a 60% reduction in primal integral and an 88% primal gap reduction, and improvement factors of up to 13.57 and 15.92, respectively.
DOMAC: Differentiable Optimization for High-Speed Multipliers and Multiply-Accumulators
Xue, Chenhao, Ren, Yi, Zhou, Jinwei, Li, Kezhi, Zhang, Chen, Lin, Yibo, Zhang, Lining, Xu, Qiang, Sun, Guangyu
Multipliers and multiply-accumulators (MACs) are fundamental building blocks for compute-intensive applications such as artificial intelligence. With the diminishing returns of Moore's Law, optimizing multiplier performance now necessitates process-aware architectural innovations rather than relying solely on technology scaling. In this paper, we introduce DOMAC, a novel approach that employs differentiable optimization for designing multipliers and MACs at specific technology nodes. DOMAC establishes an analogy between optimizing multi-staged parallel compressor trees and training deep neural networks. Building on this insight, DOMAC reformulates the discrete optimization challenge into a continuous problem by incorporating differentiable timing and area objectives. This formulation enables us to utilize existing deep learning toolkit for highly efficient implementation of the differentiable solver. Experimental results demonstrate that DOMAC achieves significant enhancements in both performance and area efficiency compared to state-of-the-art baselines and commercial IPs in multiplier and MAC designs.
Gradient-based Sample Selection for Faster Bayesian Optimization
Wei, Qiyu, Wang, Haowei, Cao, Zirui, Wang, Songhao, Allmendinger, Richard, รlvarez, Mauricio A
Bayesian optimization (BO) is an effective technique for black-box optimization. However, its applicability is typically limited to moderate-budget problems due to the cubic complexity in computing the Gaussian process (GP) surrogate model. In large-budget scenarios, directly employing the standard GP model faces significant challenges in computational time and resource requirements. In this paper, we propose a novel approach, gradient-based sample selection Bayesian Optimization (GSSBO), to enhance the computational efficiency of BO. The GP model is constructed on a selected set of samples instead of the whole dataset. These samples are selected by leveraging gradient information to maintain diversity and representation. We provide a theoretical analysis of the gradient-based sample selection strategy and obtain explicit sublinear regret bounds for our proposed framework. Extensive experiments on synthetic and real-world tasks demonstrate that our approach significantly reduces the computational cost of GP fitting in BO while maintaining optimization performance comparable to baseline methods.
CRYSIM: Prediction of Symmetric Structures of Large Crystals with GPU-based Ising Machines
Liang, Chen, Das, Diptesh, Guo, Jiang, Tamura, Ryo, Mao, Zetian, Tsuda, Koji
Solving black-box optimization problems with Ising machines is increasingly common in materials science. However, their application to crystal structure prediction (CSP) is still ineffective due to symmetry agnostic encoding of atomic coordinates. We introduce CRYSIM, an algorithm that encodes the space group, the Wyckoff positions combination, and coordinates of independent atomic sites as separate variables. This encoding reduces the search space substantially by exploiting the symmetry in space groups. When CRYSIM is interfaced to Fixstars Amplify, a GPU-based Ising machine, its prediction performance was competitive with CALYPSO and Bayesian optimization for crystals containing more than 150 atoms in a unit cell. Although it is not realistic to interface CRYSIM to current small-scale quantum devices, it has the potential to become the standard CSP algorithm in the coming quantum age.
ZIP: An Efficient Zeroth-order Prompt Tuning for Black-box Vision-Language Models
Park, Seonghwan, Jeong, Jaehyeon, Kim, Yongjun, Lee, Jaeho, Lee, Namhoon
Recent studies have introduced various approaches for prompt-tuning black-box vision-language models, referred to as black-box prompt-tuning (BBPT). While BBPT has demonstrated considerable potential, it is often found that many existing methods require an excessive number of queries (i.e., function evaluations), which poses a significant challenge in real-world scenarios where the number of allowed queries is limited. To tackle this issue, we propose Zeroth-order Intrinsic-dimensional Prompt-tuning (ZIP), a novel approach that enables efficient and robust prompt optimization in a purely black-box setting. The key idea of ZIP is to reduce the problem dimensionality and the variance of zeroth-order gradient estimates, such that the training is done fast with far less queries. We achieve this by re-parameterizing prompts in low-rank representations and designing intrinsic-dimensional clipping of estimated gradients. We evaluate ZIP on 13+ vision-language tasks in standard benchmarks and show that it achieves an average improvement of approximately 6% in few-shot accuracy and 48% in query efficiency compared to the best-performing alternative BBPT methods, establishing a new state of the art. Our ablation analysis further shows that the proposed clipping mechanism is robust and nearly optimal, without the need to manually select the clipping threshold, matching the result of expensive hyperparameter search.
Bridging the Gap Between Preference Alignment and Machine Unlearning
Feng, Xiaohua, Li, Yuyuan, Ji, Huwei, Zhang, Jiaming, Zhang, Li, Du, Tianyu, Chen, Chaochao
Despite advances in Preference Alignment (PA) for Large Language Models (LLMs), mainstream methods like Reinforcement Learning with Human Feedback (RLHF) face notable challenges. These approaches require high-quality datasets of positive preference examples, which are costly to obtain and computationally intensive due to training instability, limiting their use in low-resource scenarios. LLM unlearning technique presents a promising alternative, by directly removing the influence of negative examples. However, current research has primarily focused on empirical validation, lacking systematic quantitative analysis. To bridge this gap, we propose a framework to explore the relationship between PA and LLM unlearning. Specifically, we introduce a bi-level optimization-based method to quantify the impact of unlearning specific negative examples on PA performance. Our analysis reveals that not all negative examples contribute equally to alignment improvement when unlearned, and the effect varies significantly across examples. Building on this insight, we pose a crucial question: how can we optimally select and weight negative examples for unlearning to maximize PA performance? To answer this, we propose a framework called Unlearning to Align (U2A), which leverages bi-level optimization to efficiently select and unlearn examples for optimal PA performance. We validate the proposed method through extensive experiments, with results confirming its effectiveness.
The Power of the Pareto Front: Balancing Uncertain Rewards for Adaptive Experimentation in scanning probe microscopy
Abstract: Automated experimentation has the potential to revolutionize scientific discovery, but its effectiveness depends on well - defined optimization targets, which are often uncertain or probabilistic in real - world settings. In this work, we demonstrate the appli cation of Multi - Objective Bayesian Optimization ( MOBO) to balance multiple, competing rewards in autonomous experimentation. Using scanning probe microscopy ( SPM) imaging, one of the most widely used and foundational SPM modes, we show that MOBO can optimize imaging parameters to enhance measurement quality, reproducibility, and efficiency. A key advantage of this approach is the ability to compute and analyze the Pareto front, which not only guides optimization but also provides physical insights into the trade - offs between different objectives. Additionally, MOBO offers a natural framework for human - in - the - loop decision - making, enabling researchers to fine - tune ex perimental trade - offs based on domain expertise. By standardizing high - quality, reproducible measurements and integrating human input into AI - driven optimization, this work highlights MOBO as a powerful tool for advancing autonomous scientific discovery. I. Introduction Automated scientific discovery is rapidly emerging as a transformative research paradigm, reshaping experimental methodologies through the integration of automated instrumentation, AI - driven decision - making, and multi - tool workflows [1, 2] . By enabling autonomous hypothesis testing, adaptive experimentation, and real - time optimization, these systems have the potential to significantly accelerate discoveries across various scientific domains [18 - 21] . A fundamental requirement for active discovery workflows is the definition of optimization targets or reward functions that drive the iterative learning process [18] . These reward functions form the foundation of autonomous workflows, guiding experimental decisions and facilitating interoperability among multiple tools in complex research environments.