Goto

Collaborating Authors

 Optimization


Transferable Graph Learning for Transmission Congestion Management via Busbar Splitting

arXiv.org Artificial Intelligence

Network topology optimization (NTO) via busbar splitting can mitigate transmission grid congestion and reduce redispatch costs. However, solving this mixed-integer non-linear problem for large-scale systems in near-real-time is currently intractable with existing solvers. Machine learning (ML) approaches have emerged as a promising alternative, but they have limited generalization to unseen topologies, varying operating conditions, and different systems, which limits their practical applicability. This paper formulates NTO for congestion management problem considering linearized AC PF, and proposes a graph neural network (GNN)-accelerated approach. We develop a heterogeneous edge-aware message passing NN to predict effective busbar splitting actions as candidate NTO solutions. The proposed GNN captures local flow patterns, achieves generalization to unseen topology changes, and improves transferability across systems. Case studies show up to 4 orders-of-magnitude speed-up, delivering AC-feasible solutions within one minute and a 2.3% optimality gap on the GOC 2000-bus system. These results demonstrate a significant step toward near-real-time NTO for large-scale systems with topology and cross-system generalization.


A Parameter-Linear Formulation of the Optimal Path Following Problem for Robotic Manipulator

arXiv.org Artificial Intelligence

In this paper the computational challenges of time-optimal path following are addressed. The standard approach is to minimize the travel time, which inevitably leads to singularities at zero path speed, when reformulating the optimization problem in terms of a path parameter. Thus, smooth trajectory generation while maintaining a low computational effort is quite challenging, since the singularities have to be taken into account. To this end, a different approach is presented in this paper. This approach is based on maximizing the path speed along a prescribed path. Furthermore, the approach is capable of planning smooth trajectories numerically efficient. Moreover, the discrete reformulation of the underlying problem is linear in optimization variables.


Behavior-Aware Online Prediction of Obstacle Occupancy using Zonotopes

arXiv.org Artificial Intelligence

Abstract-- Predicting the motion of surrounding vehicles is key to safe autonomous driving, especially in unstructured environments without prior information. This paper proposes a novel online method to accurately predict the occupancy sets of surrounding vehicles based solely on motion observations. The approach is divided into two stages: first, an Extended Kalman Filter and a Linear Programming (LP) problem are used to estimate a compact zonotopic set of control actions; then, a reachability analysis propagates this set to predict future occupancy. The effectiveness of the method has been validated through simulations in an urban environment, showing accurate and compact predictions without relying on prior assumptions or prior training data. I. INTRODUCTION Autonomous driving has generated great research interests given the expected benefits, such as reducing accidents, optimizing traffic efficiency and energy management [1]. However, ensuring safety remains a major challenge, particularly in urban environments, where multiple agents interact dynamically [2].Predicting the motion of surrounding vehicles (SVs) is critical to designing safe motion planning and control strategies for autonomous vehicles.


Partial Optimality in Cubic Correlation Clustering for General Graphs

arXiv.org Artificial Intelligence

The higher-order correlation clustering problem for a graph $G$ and costs associated with cliques of $G$ consists in finding a clustering of $G$ so as to minimize the sum of the costs of those cliques whose nodes all belong to the same cluster. To tackle this NP-hard problem in practice, local search heuristics have been proposed and studied in the context of applications. Here, we establish partial optimality conditions for cubic correlation clustering, i.e., for the special case of at most 3-cliques. We define and implement algorithms for deciding these conditions and examine their effectiveness numerically, on two data sets.


Why DPO is a Misspecified Estimator and How to Fix It

arXiv.org Artificial Intelligence

Direct alignment algorithms such as Direct Preference Optimization (DPO) fine-tune models based on preference data, using only supervised learning instead of two-stage reinforcement learning with human feedback (RLHF). We show that DPO encodes a statistical estimation problem over reward functions induced by a parametric policy class. When the true reward function that generates preferences cannot be realized via the policy class, DPO becomes misspecified, resulting in failure modes such as preference order reversal, worsening of policy reward, and high sensitivity to the input preference data distribution. On the other hand, we study the local behavior of two-stage RLHF for a parametric class and relate it to a natural gradient step in policy space. Our fine-grained geometric characterization allows us to propose AuxDPO, which introduces additional auxiliary variables in the DPO loss function to help move towards the RLHF solution in a principled manner and mitigate the misspecification in DPO. We empirically demonstrate the superior performance of AuxDPO on didactic bandit settings as well as LLM alignment tasks.


Hierarchical Time Series Forecasting with Robust Reconciliation

arXiv.org Artificial Intelligence

This paper focuses on forecasting hierarchical time-series data, where each higher-level observation equals the sum of its corresponding lower-level time series. In such contexts, the forecast values should be coherent, meaning that the forecast value of each parent series exactly matches the sum of the forecast values of its child series. Existing hierarchical forecasting methods typically generate base forecasts independently for each series and then apply a reconciliation procedure to adjust them so that the resulting forecast values are coherent across the hierarchy. These methods generally derive an optimal reconciliation, using a covariance matrix of the forecast error. In practice, however, the true covariance matrix is unknown and has to be estimated from finite samples in advance. This gap between the true and estimated covariance matrix may degrade forecast performance. To address this issue, we propose a robust optimization framework for hierarchical reconciliation that accounts for uncertainty in the estimated covariance matrix. We first introduce an uncertainty set for the estimated covariance matrix and formulate a reconciliation problem that minimizes the worst-case expected squared error over this uncertainty set. We show that our problem can be cast as a semidefinite optimization problem. Numerical experiments demonstrate that the proposed robust reconciliation method achieved better forecast performance than existing hierarchical forecasting methods, which indicates the effectiveness of integrating uncertainty into the reconciliation process.


LEGO: A Lightweight and Efficient Multiple-Attribute Unlearning Framework for Recommender Systems

arXiv.org Artificial Intelligence

With the growing demand for safeguarding sensitive user information in recommender systems, recommendation attribute unlearning is receiving increasing attention. Existing studies predominantly focus on single-attribute unlearning. However, privacy protection requirements in the real world often involve multiple sensitive attributes and are dynamic. Existing single-attribute unlearning methods cannot meet these real-world requirements due to i) CH1: the inability to handle multiple unlearning requests simultaneously, and ii) CH2: the lack of efficient adaptability to dynamic unlearning needs. To address these challenges, we propose LEGO, a lightweight and efficient multiple-attribute unlearning framework. Specifically, we divide the multiple-attribute unlearning process into two steps: i) Embedding Calibration removes information related to a specific attribute from user embedding, and ii) Flexible Combination combines these embeddings into a single embedding, protecting all sensitive attributes. We frame the unlearning process as a mutual information minimization problem, providing LEGO a theoretical guarantee of simultaneous unlearning, thereby addressing CH1. With the two-step framework, where Embedding Calibration can be performed in parallel and Flexible Combination is flexible and efficient, we address CH2. Extensive experiments on three real-world datasets across three representative recommendation models demonstrate the effectiveness and efficiency of our proposed framework. Our code and appendix are available at https://github.com/anonymifish/lego-rec-multiple-attribute-unlearning.


Alternatives to the Laplacian for Scalable Spectral Clustering with Group Fairness Constraints

arXiv.org Artificial Intelligence

Recent research has focused on mitigating algorithmic bias in clustering by incorporating fairness constraints into algorithmic design. Notions such as disparate impact, community cohesion, and cost per population have been implemented to enforce equitable outcomes. Among these, group fairness (balance) ensures that each protected group is proportionally represented within every cluster. However, incorporating balance as a metric of fairness into spectral clustering algorithms has led to computational times that can be improved. This study aims to enhance the efficiency of spectral clustering algorithms by reformulating the constrained optimization problem using a new formulation derived from the Lagrangian method and the Sherman-Morrison-Woodbury (SMW) identity, resulting in the Fair-SMW algorithm. Fair-SMW employs three alternatives to the Laplacian matrix with different spectral gaps to generate multiple variations of Fair-SMW, achieving clustering solutions with comparable balance to existing algorithms while offering improved runtime performance. We present the results of Fair-SMW, evaluated using the Stochastic Block Model (SBM) to measure both runtime efficiency and balance across real-world network datasets, including LastFM, FacebookNet, Deezer, and German. We achieve an improvement in computation time that is twice as fast as the state-of-the-art, and also flexible enough to achieve twice as much balance.


From Optimization to Prediction: Transformer-Based Path-Flow Estimation to the Traffic Assignment Problem

arXiv.org Artificial Intelligence

The traffic assignment problem is essential for traffic flow analysis, traditionally solved using mathematical programs under the Equilibrium principle. These methods become computationally prohibitive for large-scale networks due to non-linear growth in complexity with the number of OD pairs. This study introduces a novel data-driven approach using deep neural networks, specifically leveraging the Transformer architecture, to predict equilibrium path flows directly. By focusing on path-level traffic distribution, the proposed model captures intricate correlations between OD pairs, offering a more detailed and flexible analysis compared to traditional link-level approaches. The Transformer-based model drastically reduces computation time, while adapting to changes in demand and network structure without the need for recalculation. Numerical experiments are conducted on the Manhattan-like synthetic network, the Sioux Falls network, and the Eastern-Massachusetts network. The results demonstrate that the proposed model is orders of magnitude faster than conventional optimization. It efficiently estimates path-level traffic flows in multi-class networks, reducing computational costs and improving prediction accuracy by capturing detailed trip and flow information. Introduction The Traffic Assignment Problem (TAP) is a process of determining the propagation of flows over the transportation network. The goal is to calculate the network state, given the travel demand between various origin-destination (OD) pairs and the network's capacity constraints (Y osef Sheffi. Traditionally, this problem is solved through mathematical programs under the User Equilibrium (UE) principle, which assumes drivers possess perfect information and make fully rational choices (Wardrop, 1952). Despite potential deviations from reality, this approach consistently provides reasonable solutions to the traffic assignment problem (Bar-Gera, 2002; Jafari et al., 2017). However, the computation for determining optimal solutions in large traffic networks is prohibitively costly. This is because the problem's complexity grows non-linearly with the increase in the number of OD pairs and directly depends on feasible paths. When the size of the network (the number of links and nodes in a representative graph) increases, allowing us to explore more paths, the number of feasible paths also increases, and the OD demand matrix may grow accordingly, leading to a non-linear increase in computation time (Patriksson, 2015).


A Quantum-Inspired Algorithm for Solving Sudoku Puzzles and the MaxCut Problem

arXiv.org Artificial Intelligence

We propose and evaluate a quantum-inspired algorithm for solving Quadratic Unconstrained Binary Optimization (QUBO) problems, which are mathematically equivalent to finding ground states of Ising spin-glass Hamiltonians. The algorithm employs Matrix Product States (MPS) to compactly represent large superpositions of spin configurations and utilizes a discrete driving schedule to guide the MPS toward the ground state. At each step, a driver Hamiltonian -- incorporating a transverse magnetic field -- is combined with the problem Hamiltonian to enable spin flips and facilitate quantum tunneling. The MPS is updated using the standard Density Matrix Renormalization Group (DMRG) method, which iteratively minimizes the system's energy via multiple sweeps across the spin chain. Despite its heuristic nature, the algorithm reliably identifies global minima, not merely near-optimal solutions, across diverse QUBO instances. We first demonstrate its effectiveness on intermediate-level Sudoku puzzles from publicly available sources, involving over $200$ Ising spins with long-range couplings dictated by constraint satisfaction. We then apply the algorithm to MaxCut problems from the Biq Mac library, successfully solving instances with up to $251$ nodes and $3,265$ edges. We discuss the advantages of this quantum-inspired approach, including its scalability, generalizability, and suitability for industrial-scale QUBO applications.