Goto

Collaborating Authors

 Optimization


AReUReDi: Annealed Rectified Updates for Refining Discrete Flows with Multi-Objective Guidance

arXiv.org Artificial Intelligence

Designing sequences that satisfy multiple, often conflicting, objectives is a central challenge in therapeutic and biomolecular engineering. Existing generative frameworks largely operate in continuous spaces with single-objective guidance, while discrete approaches lack guarantees for multi-objective Pareto optimality. We introduce AReUReDi (Annealed Rectified Updates for Refining Discrete Flows), a discrete optimization algorithm with theoretical guarantees of convergence to the Pareto front. Building on Rectified Discrete Flows (ReDi), AReUReDi combines Tchebycheff scalarization, locally balanced proposals, and annealed Metropolis-Hastings updates to bias sampling toward Pareto-optimal states while preserving distributional invariance. Applied to peptide and SMILES sequence design, AReUReDi simultaneously optimizes up to five therapeutic properties (including affinity, solubility, hemolysis, half-life, and non-fouling) and outperforms both evolutionary and diffusion-based baselines. These results establish AReUReDi as a powerful, sequence-based framework for multi-property biomolecule generation.


Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach

arXiv.org Artificial Intelligence

This paper investigates a fundamental yet underexplored issue in Salient Object Detection (SOD): the size-invariant property for evaluation protocols, particularly in scenarios when multiple salient objects of significantly different sizes appear within a single image. We first present a novel perspective to expose the inherent size sensitivity of existing widely used SOD metrics. Through careful theoretical derivations, we show that the evaluation outcome of an image under current SOD metrics can be essentially decomposed into a sum of several separable terms, with the contribution of each term being directly proportional to its corresponding region size. Consequently, the prediction errors would be dominated by the larger regions, while smaller yet potentially more semantically important objects are often overlooked, leading to biased performance assessments and practical degradation. To address this challenge, a generic Size-Invariant Evaluation (SIEva) framework is proposed. The core idea is to evaluate each separable component individually and then aggregate the results, thereby effectively mitigating the impact of size imbalance across objects. Building upon this, we further develop a dedicated optimization framework (SIOpt), which adheres to the size-invariant principle and significantly enhances the detection of salient objects across a broad range of sizes. Notably, SIOpt is model-agnostic and can be seamlessly integrated with a wide range of SOD backbones. Theoretically, we also present generalization analysis of SOD methods and provide evidence supporting the validity of our new evaluation protocols. Finally, comprehensive experiments speak to the efficacy of our proposed approach. The code is available at https://github.com/Ferry-Li/SI-SOD.


Preconditioned subgradient method for composite optimization: overparameterization and fast convergence

arXiv.org Artificial Intelligence

Composite optimization problems involve minimizing the composition of a smooth map with a convex function. Such objectives arise in numerous data science and signal processing applications, including phase retrieval, blind deconvolution, and collaborative filtering. The subgradient method achieves local linear convergence when the composite loss is well-conditioned. However, if the smooth map is, in a certain sense, ill-conditioned or overparameterized, the subgradient method exhibits much slower sublinear convergence even when the convex function is well-conditioned. To overcome this limitation, we introduce a Levenberg-Morrison-Marquardt subgradient method that converges linearly under mild regularity conditions at a rate determined solely by the convex function. Further, we demonstrate that these regularity conditions hold for several problems of practical interest, including square-variable formulations, matrix sensing, and tensor factorization. Numerical experiments illustrate the benefits of our method.


Joint Bidding on Intraday and Frequency Containment Reserve Markets

arXiv.org Artificial Intelligence

As renewable energy integration increases supply variability, battery energy storage systems (BESS) present a viable solution for balancing supply and demand. This paper proposes a novel approach for optimizing battery BESS participation in multiple electricity markets. We develop a joint bidding strategy that combines participation in the primary frequency reserve market with continuous trading in the intraday market, addressing a gap in the extant literature which typically considers these markets in isolation or simplifies the continuous nature of intraday trading. Our approach utilizes a mixed integer linear programming implementation of the rolling intrinsic algorithm for intraday decisions and state of charge recovery, alongside a learned classifier strategy (LCS) that determines optimal capacity allocation between markets. A comprehensive out-of-sample backtest over more than one year of historical German market data validates our approach: The LCS increases overall profits by over 4% compared to the best-performing static strategy and by more than 3% over a naive dynamic benchmark. Crucially, our method closes the gap to a theoretical perfect foresight strategy to just 4%, demonstrating the effectiveness of dynamic, learning-based allocation in a complex, multi-market environment.


Bootstrap Learning for Combinatorial Graph Alignment with Sequential GNNs

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) have struggled to outperform traditional optimization methods on combinatorial problems, limiting their practical impact. We address this gap by introducing a novel chaining procedure for the graph alignment problem--a fundamental NP-hard task of finding optimal node correspondences between unlabeled graphs using only structural information. Our method trains a sequence of GNNs where each network learns to iteratively refine similarity matrices produced by previous networks. During inference, this creates a bootstrap effect: each GNN improves upon partial solutions by incorporating discrete ranking information about node alignment quality from prior iterations. We combine this with a powerful architecture that operates on node pairs rather than individual nodes, capturing global structural patterns essential for alignment that standard message-passing networks cannot represent. Extensive experiments on synthetic benchmarks demonstrate substantial improvements: our chained GNNs achieve over 3 better accuracy than existing methods on challenging instances, and uniquely solve regular graphs where all competing approaches fail. When combined with traditional optimization as post-processing, our method substantially outperforms state-of-the-art solvers on the graph alignment benchmark. "Combinatorial optimization searches for an optimum object in a finite collection of objects. Typically, the collection has a concise representation (like a graph), while the number of objects is huge."(Schrijver


ZeroShotOpt: Towards Zero-Shot Pretrained Models for Efficient Black-Box Optimization

arXiv.org Artificial Intelligence

Global optimization of expensive, derivative-free black-box functions requires extreme sample efficiency. While Bayesian optimization (BO) is the current state-of-the-art, its performance hinges on surrogate and acquisition function hyper-parameters that are often hand-tuned and fail to generalize across problem landscapes. We present ZeroShotOpt, a general-purpose, pretrained model for continuous black-box optimization tasks ranging from 2D to 20D. Our approach leverages offline reinforcement learning on large-scale optimization trajectories collected from 12 BO variants. To scale pretraining, we generate millions of synthetic Gaussian process-based functions with diverse landscapes, enabling the model to learn transferable optimization policies. As a result, ZeroShotOpt achieves robust zero-shot generalization on a wide array of unseen benchmarks, matching or surpassing the sample efficiency of leading global optimizers, including BO, while also offering a reusable foundation for future extensions and improvements. Our open-source code, dataset, and model are available at: https://github.com/jamisonmeindl/zeroshotopt


FR-LUX: Friction-Aware, Regime-Conditioned Policy Optimization for Implementable Portfolio Management

arXiv.org Artificial Intelligence

Transaction costs and regime shifts are major reasons why paper portfolios fail in live trading. We introduce FR-LUX (Friction-aware, Regime-conditioned Learning under eXecution costs), a reinforcement learning framework that learns after-cost trading policies and remains robust across volatility-liquidity regimes. FR-LUX integrates three ingredients: (i) a microstructure-consistent execution model combining proportional and impact costs, directly embedded in the reward; (ii) a trade-space trust region that constrains changes in inventory flow rather than logits, yielding stable low-turnover updates; and (iii) explicit regime conditioning so the policy specializes to LL/LH/HL/HH states without fragmenting the data. On a 4 x 5 grid of regimes and cost levels with multiple random seeds, FR-LUX achieves the top average Sharpe ratio with narrow bootstrap confidence intervals, maintains a flatter cost-performance slope than strong baselines, and attains superior risk-return efficiency for a given turnover budget. Pairwise scenario-level improvements are strictly positive and remain statistically significant after multiple-testing corrections. We provide formal guarantees on optimality under convex frictions, monotonic improvement under a KL trust region, long-run turnover bounds and induced inaction bands due to proportional costs, positive value advantage for regime-conditioned policies, and robustness to cost misspecification. The methodology is implementable: costs are calibrated from standard liquidity proxies, scenario-level inference avoids pseudo-replication, and all figures and tables are reproducible from released artifacts.


Delay-Tolerant Augmented-Consensus-based Distributed Directed Optimization

arXiv.org Artificial Intelligence

Distributed optimization finds applications in large-scale machine learning, data processing and classification over multi-agent networks. In real-world scenarios, the communication network of agents may encounter latency that may affect the convergence of the optimization protocol. This paper addresses the case where the information exchange among the agents (computing nodes) over data-transmission channels (links) might be subject to communication time-delays, which is not well addressed in the existing literature. Our proposed algorithm improves the state-of-the-art by handling heterogeneous and arbitrary but bounded and fixed (time-invariant) delays over general strongly-connected directed networks. Arguments from matrix theory, algebraic graph theory, and augmented consensus formulation are applied to prove the convergence to the optimal value. Simulations are provided to verify the results and compare the performance with some existing delay-free algorithms.


Mitigating Spurious Correlation via Distributionally Robust Learning with Hierarchical Ambiguity Sets

arXiv.org Artificial Intelligence

Conventional supervised learning methods are often vulnerable to spurious correlations, particularly under distribution shifts in test data. To address this issue, several approaches, most notably Group DRO, have been developed. While these methods are highly robust to subpopulation or group shifts, they remain vulnerable to intra-group distributional shifts, which frequently occur in minority groups with limited samples. We propose a hierarchical extension of Group DRO that addresses both inter-group and intra-group uncertainties, providing robustness to distribution shifts at multiple levels. We also introduce new benchmark settings that simulate realistic minority group distribution shifts-an important yet previously underexplored challenge in spurious correlation research. Our method demonstrates strong robustness under these conditions-where existing robust learning methods consistently fail-while also achieving superior performance on standard benchmarks. These results highlight the importance of broadening the ambiguity set to better capture both inter-group and intra-group distributional uncertainties.


OptunaHub: A Platform for Black-Box Optimization

arXiv.org Artificial Intelligence

Black-box optimization (BBO) drives advances in domains such as AutoML and Materials Informatics, yet research efforts often remain fragmented across domains. We introduce OptunaHub (https://hub.optuna.org/), a community platform that centralizes BBO methods and benchmarks. OptunaHub provides unified Python APIs, a contributor package registry, and a web interface to promote searchability and cross-domain research. OptunaHub aims to foster a virtuous cycle of contributions and applications. The source code is publicly available in the optunahub, optunahub-registry, and optunahub-web repositories under the Optuna organization on GitHub (https://github.com/optuna/).