Optimization
Sequential Multi-Agent Dynamic Algorithm Configuration
Lu, Chen, Xue, Ke, Yuan, Lei, Wang, Yao, Wang, Yaoyuan, Fu, Sheng, Qian, Chao
Dynamic algorithm configuration (DAC) is a recent trend in automated machine learning, which can dynamically adjust the algorithm's configuration during the execution process and relieve users from tedious trial-and-error tuning tasks. Recently, multi-agent reinforcement learning (MARL) approaches have improved the configuration of multiple heterogeneous hyperparameters, making various parameter configurations for complex algorithms possible. However, many complex algorithms have inherent inter-dependencies among multiple parameters (e.g., determining the operator type first and then the operator's parameter), which are, however, not considered in previous approaches, thus leading to sub-optimal results. In this paper, we propose the sequential multi-agent DAC (Seq-MADAC) framework to address this issue by considering the inherent inter-dependencies of multiple parameters. Specifically, we propose a sequential advantage decomposition network, which can leverage action-order information through sequential advantage decomposition. Experiments from synthetic functions to the configuration of multi-objective optimization algorithms demonstrate Seq-MADAC's superior performance over state-of-the-art MARL methods and show strong generalization across problem classes. Seq-MADAC establishes a new paradigm for the widespread dependency-aware automated algorithm configuration. Our code is available at https://github.com/lamda-bbo/seq-madac.
A Literature Review On Stewart-Gough Platform Calibrations A Literature Review On Stewart-Gough Platform Calibrations
Karmakar, Sourabh, Turner, Cameron J.
Researchers have studied Stewart-Gough platforms, also known as Gough-Stewart platforms or hexapod platforms extensively for their inherent fine control characteristics. Their studies led to the potential deployment opportunities of Stewart-Gough Platforms in many critical applications such as the medical field, engineering machines, space research, electronic chip manufacturing, automobile manufacturing, etc. Some of these applications need micro and nano-level movement control in 3D space for the motions to be precise, complicated, and repeatable; a Stewart-Gough platform fulfills these challenges smartly. For this, the platform must be more accurate than the specified application accuracy level and thus proper calibration for a parallel robot is crucial. Forward kinematics-based calibration for these hexapod machines becomes unnecessarily complex and inverse kinematics complete this task with much ease. To experiment with different calibration techniques, various calibration approaches were implemented by using external instruments, constraining one or more motions of the system, and using extra sensors for auto or self-calibration. This survey paid attention to those key methodologies, their outcome, and important details related to inverse kinematic-based parallel robot calibrations. It was observed during this study that the researchers focused on improving the accuracy of the platform position and orientation considering the errors contributed by one source or multiple sources. The error sources considered are mainly kinematic and structural, in some cases, environmental factors also are reviewed, however, those calibrations are done under no-load conditions. This study aims to review the present state of the art in this field and highlight the processes and errors considered for the calibration of Stewart-Gough platforms.
Complexity Scaling Laws for Neural Models using Combinatorial Optimization
Weissman, Lowell, Krumdick, Michael, Abbott, A. Lynn
Recent work on neural scaling laws demonstrates that model performance scales predictably with compute budget, model size, and dataset size. In this work, we develop scaling laws based on problem complexity. We analyze two fundamental complexity measures: solution space size and representation space size. Using the Traveling Salesman Problem (TSP) as a case study, we show that combinatorial optimization promotes smooth cost trends, and therefore meaningful scaling laws can be obtained even in the absence of an interpretable loss. We then show that suboptimality grows predictably for fixed-size models when scaling the number of TSP nodes or spatial dimensions, independent of whether the model was trained with reinforcement learning or supervised fine-tuning on a static dataset. We conclude with an analogy to problem complexity scaling in local search, showing that a much simpler gradient descent of the cost landscape produces similar trends.
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Refael, Yehonathan, Smorodinsky, Guy, Tirer, Tom, Lindenbaum, Ofir
Low-rank gradient-based optimization methods have significantly improved memory efficiency during the training of large language models (LLMs), enabling operations within constrained hardware without sacrificing performance. However, these methods primarily emphasize memory savings, often overlooking potential acceleration in convergence due to their reliance on standard isotropic steepest descent techniques, which can perform suboptimally in the highly anisotropic landscapes typical of deep networks, particularly LLMs. In this paper, we propose SUMO (Subspace-Aware Moment-Orthogonalization), an optimizer that employs exact singular value decomposition (SVD) for moment orthogonalization within a dynamically adapted low-dimensional subspace, enabling norm-inducing steepest descent optimization steps. By explicitly aligning optimization steps with the spectral characteristics of the loss landscape, SUMO effectively mitigates approximation errors associated with commonly used methods like Newton-Schulz orthogonalization approximation. We theoretically establish an upper bound on these approximation errors, proving their dependence on the condition numbers of moments, conditions we analytically demonstrate are encountered during LLM training. Furthermore, we both theoretically and empirically illustrate that exact orthogonalization via SVD substantially improves convergence rates while reducing overall complexity. Empirical evaluations confirm that SUMO accelerates convergence, enhances stability, improves performance, and reduces memory requirements by up to 20% compared to state-of-the-art methods.
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
Shetty, Manish, Jain, Naman, Liu, Jinjian, Kethanaboyina, Vijay, Sen, Koushik, Stoica, Ion
Developing high-performance software is a complex task that requires specialized expertise. We introduce GSO, a benchmark for evaluating language models' capabilities in developing high-performance software. We develop an automated pipeline that generates and executes performance tests to analyze repository commit histories to identify 102 challenging optimization tasks across 10 codebases, spanning diverse domains and programming languages. An agent is provided with a codebase and performance test as a precise specification, and tasked to improve the runtime efficiency, which is measured against the expert developer optimization. Our quantitative evaluation reveals that leading SWE-Agents struggle significantly, achieving less than 5% success rate, with limited improvements even with inference-time scaling. Our qualitative analysis identifies key failure modes, including difficulties with low-level languages, practicing lazy optimization strategies, and challenges in accurately localizing bottlenecks. We release the code and artifacts of our benchmark along with agent trajectories to enable future research.
MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback
Liu, Wanhao, Yang, Zonglin, Wang, Jue, Bing, Lidong, Zhang, Di, Zhou, Dongzhan, Li, Yuqiang, Li, Houqiang, Cambria, Erik, Ouyang, Wanli
Hypothesis ranking is vital for automated scientific discovery, especially in cost-intensive, throughput-limited natural science domains. Current methods focus on pre-experiment ranking, relying solely on language model reasoning without empirical feedback. We introduce experiment-guided ranking, which prioritizes hypotheses based on feedback from prior tests. Due to the impracticality of real experiments, we propose a simulator grounded in domain-specific concepts that models hypothesis performance as a function of similarity to a hidden ground truth, perturbed by noise. Validated against 124 hypotheses with experimentally reported outcomes, the simulator approximates real results with consistent trend alignment. Although deviations exist, they mimic wet-lab noise, promoting more robust ranking strategies. We frame experiment-guided ranking as a sequential decision-making problem and propose an in-context reinforcement learning (ICRL) framework. Our LLM-based policy decomposes hypotheses into functional elements, clusters them by mechanistic roles, and prioritizes recombinations based on feedback. Experiments show our approach significantly outperforms pre-experiment baselines and strong ablations. Our toolkit, comprising the simulator and ICRL framework, enables systematic research on experiment-guided ranking, with the policy serving as a strong proof of concept.
LyapLock: Bounded Knowledge Preservation in Sequential Large Language Model Editing
Wang, Peng, Zhou, Biyu, Tang, Xuehai, Han, Jizhong, Hu, Songlin
Large Language Models often contain factually incorrect or outdated knowledge, giving rise to model editing methods for precise knowledge updates. However, current mainstream locate-then-edit approaches exhibit a progressive performance decline during sequential editing, due to inadequate mechanisms for long-term knowledge preservation. To tackle this, we model the sequential editing as a constrained stochastic programming. Given the challenges posed by the cumulative preservation error constraint and the gradually revealed editing tasks, \textbf{LyapLock} is proposed. It integrates queuing theory and Lyapunov optimization to decompose the long-term constrained programming into tractable stepwise subproblems for efficient solving. This is the first model editing framework with rigorous theoretical guarantees, achieving asymptotic optimal editing performance while meeting the constraints of long-term knowledge preservation. Experimental results show that our framework scales sequential editing capacity to over 10,000 edits while stabilizing general capabilities and boosting average editing efficacy by 11.89\% over SOTA baselines. Furthermore, it can be leveraged to enhance the performance of baseline methods. Our code is released on https://github.com/caskcsg/LyapLock.
Temporal Robustness in Discrete Time Linear Dynamical Systems
Metya, Nilava, Shah, Ankit, Sinha, Arunesh
Discrete time linear dynamical systems, including Markov chains, have found many applications including in security settings such as in cybersecurity operations center (CSOC) management and in managing health risks. However, in these two scenarios, there is uncertainty about the time horizon for which the system runs. This creates uncertainty about the cost (or reward) incurred based on the state distribution when the system stops. Given past data samples of how long a system ran, we theoretically analyze the cost incurred at the stop of the system as a distributional robust cost estimation task in a Wasserstein ambiguity set. Towards this, we show an equivalence between a discrete time Markov Chain on a probability simplex and a global asymptotic stable (GAS) discrete time linear dynamical system, allowing us to base our study on a GAS system only. Then, we provide various polynomial time algorithms and hardness results for different cases in our theoretical study, including a novel proof of a fundamental result about Wassertein distance based polytope. We experiment with real world data in CSOC domain and prior data in health domain to reveal the benefits of our model and approach.
Bayes-Split-Edge: Bayesian Optimization for Constrained Collaborative Inference in Wireless Edge Systems
Safaeipour, Fatemeh Zahra, Chakareski, Jacob, Hashemi, Morteza
Mobile edge devices (e.g., AR/VR headsets) typically need to complete timely inference tasks while operating with limited on-board computing and energy resources. In this paper, we investigate the problem of collaborative inference in wireless edge networks, where energy-constrained edge devices aim to complete inference tasks within given deadlines. These tasks are carried out using neural networks, and the edge device seeks to optimize inference performance under energy and delay constraints. The inference process can be split between the edge device and an edge server, thereby achieving collaborative inference over wireless networks. We formulate an inference utility optimization problem subject to energy and delay constraints, and propose a novel solution called Bayes-Split-Edge, which leverages Bayesian optimization for collaborative split inference over wireless edge networks. Our solution jointly optimizes the transmission power and the neural network split point. The Bayes-Split-Edge framework incorporates a novel hybrid acquisition function that balances inference task utility, sample efficiency, and constraint violation penalties. We evaluate our approach using the VGG19 model on the ImageNet-Mini dataset, and Resnet101 on Tiny-ImageNet, and real-world mMobile wireless channel datasets. Numerical results demonstrate that Bayes-Split-Edge achieves up to 2.4x reduction in evaluation cost compared to standard Bayesian optimization and achieves near-linear convergence. It also outperforms several baselines, including CMA-ES, DIRECT, exhaustive search, and Proximal Policy Optimization (PPO), while matching exhaustive search performance under tight constraints. These results confirm that the proposed framework provides a sample-efficient solution requiring maximum 20 function evaluations and constraint-aware optimization for wireless split inference in edge computing systems.
BBOPlace-Bench: Benchmarking Black-Box Optimization for Chip Placement
Xue, Ke, Chen, Ruo-Tong, Tan, Rong-Xi, Lin, Xi, Shi, Yunqi, Xu, Siyuan, Yuan, Mingxuan, Qian, Chao
Abstract--Chip placement is a vital stage in modern chip design as it has a substantial impact on the subsequent processes and the overall quality of the final chip. The use of black-box optimization (BBO) for chip placement has a history of several decades. However, early efforts were limited by immature problem formulations and inefficient algorithm designs, leading to suboptimal efficiency, quality, and scalability, compared to the more prevalent analytical methods. Recent progress in problem formulation and algorithm design has shown the effectiveness and efficiency of BBO for chip placement, proving its potential to achieve state-of-the-art results. Despite these advancements, the field lacks a unified, BBO-specific benchmark for thoroughly assessing various problem formulations and BBO algorithms. T o fill this gap, we propose BBOPlace-Bench, the first benchmark designed specifically for evaluating and developing BBO algorithms for chip placement tasks. It integrates three problem formulations (with permutation, continuous, and mixed search spaces, respectively) of BBO for chip placement, and offers a modular, decoupled, and flexible framework that enables users to seamlessly implement, test, and compare their own algorithms. BBOPlace-Bench aggregates modern chip cases from representative chip cases (ISPD 2005, ICCAD 2015) and standardizes their formats, providing uniform and comprehensive information to support BBO optimization. Moreover, it integrates a wide variety of existing BBO algorithms, including simulated annealing (SA), evolutionary algorithms (EAs), and Bayesian optimization (BO), and systematically evaluates their performance across different problem formulations using key metrics (e.g., macro placement wirelength and global placement wirelength) of chip. Experimental results show that the problem formulations of mask-guided optimization and hyperparameter optimization exhibit superior performance than the sequence pair problem formulation, while EAs demonstrate better overall performance than SA and BO, especially in high-dimensional search spaces, and also achieve state-of-the-art performance compared to the mainstream chip placement methods, i.e., analytical methods and reinforcement learning methods. BBOPlace-Bench not only facilitates the development of efficient BBO-driven solutions for chip placement but also broadens the practical application scenarios (which are urgently needed) for the BBO community. The code of BBOPlace-Bench is available at https://github.com/ The first three authors contributed equally.