Goto

Collaborating Authors

 Optimization


A Scalable Global Optimization Algorithm For Constrained Clustering

arXiv.org Artificial Intelligence

Constrained clustering leverages limited domain knowledge to improve clustering performance and interpretability, but incorporating pairwise must-link and cannot-link constraints is an NP-hard challenge, making global optimization intractable. Existing mixed-integer optimization methods are confined to small-scale datasets, limiting their utility. We propose Sample-Driven Constrained Group-Based Branch-and-Bound (SDC-GBB), a decomposable branch-and-bound (BB) framework that collapses must-linked samples into centroid-based pseudo-samples and prunes cannot-link through geometric rules, while preserving convergence and guaranteeing global optimality. By integrating grouped-sample Lagrangian decomposition and geometric elimination rules for efficient lower and upper bounds, the algorithm attains highly scalable pairwise k-Means constrained clustering via parallelism. Experimental results show that our approach handles datasets with 200,000 samples with cannot-link constraints and 1,500,000 samples with must-link constraints, which is 200 - 1500 times larger than the current state-of-the-art under comparable constraint settings, while reaching an optimality gap of less than 3%. In providing deterministic global guarantees, our method also avoids the search failures that off-the-shelf heuristics often encounter on large datasets.


Learning Local Stackelberg Equilibria from Repeated Interactions with a Learning Agent

arXiv.org Artificial Intelligence

Motivated by the question of how a principal can maximize its utility in repeated interactions with a learning agent, we study repeated games between an principal and an agent employing a mean-based learning algorithm. Prior work has shown that computing or even approximating the global Stackelberg value in similar settings can require an exponential number of rounds in the size of the agent's action space, making it computationally intractable. In contrast, we shift focus to the computation of local Stackelberg equilibria and introduce an algorithm that, within the smoothed analysis framework, constitutes a Polynomial Time Approximation Scheme (PTAS) for finding an epsilon-approximate local Stackelberg equilibrium. Notably, the algorithm's runtime is polynomial in the size of the agent's action space yet exponential in (1/epsilon) - a dependency we prove to be unavoidable.


Graph-Coarsening Approach for the Capacitated Vehicle Routing Problem with Time Windows

arXiv.org Artificial Intelligence

H(x) includes large-weight penalty terms for each constraint (visiting each customer once, capacity limit, time windows, depot start/return, etc.). The result is a quadratic binary optimization problem whose ground-state solution corresponds to an optimal or near-optimal set of routes for the CVRPTW [14]. Prior works have demonstrated how to construct such QUBO models for vehicle routing problems. For instance, Papal-itsas et al. [14] developed a QUBO model for the Traveling Salesman Problem with Time Windows, encoding time window constraints with binary variables and penalties. Likewise, recent studies have formulated capacitated VRPs as QUBO by embedding route feasibility conditions into the objective function [8]. Although formulating the CVRPTW as a QUBO is complex - especially due to the time window inequalities - it enables the use of quantum annealers and digital annealing hardware to search for high-quality solutions by minimizing the combined objective [10].


Probing Neural Combinatorial Optimization Models

arXiv.org Artificial Intelligence

Neural combinatorial optimization (NCO) has achieved remarkable performance, yet its learned model representations and decision rationale remain a black box. This impedes both academic research and practical deployment, since researchers and stakeholders require deeper insights into NCO models. In this paper, we take the first critical step towards interpreting NCO models by investigating their representations through various probing tasks. Moreover, we introduce a novel probing tool named Coefficient Significance Probing (CS-Probing) to enable deeper analysis of NCO representations by examining the coefficients and statistical significance during probing. Extensive experiments and analysis reveal that NCO models encode low-level information essential for solution construction, while capturing high-level knowledge to facilitate better decisions. Using CS-Probing, we find that prevalent NCO models impose varying inductive biases on their learned representations, uncover direct evidence related to model generalization, and identify key embedding dimensions associated with specific knowledge. These insights can be potentially translated into practice, for example, with minor code modifications, we improve the generalization of the analyzed model. Our work represents a first systematic attempt to interpret black-box NCO models, showcasing probing as a promising tool for analyzing their internal mechanisms and revealing insights for the NCO community. The source code is publicly available.


Efficient Utility-Preserving Machine Unlearning with Implicit Gradient Surgery

arXiv.org Artificial Intelligence

Machine unlearning (MU) aims to efficiently remove sensitive or harmful memory from a pre-trained model. The key challenge is to balance the potential tradeoff between unlearning efficacy and utility preservation, which involves forgetting undesirable information as defined while maintaining the model's original performance. One potential way to tackle this problem is to use multi-objective optimization to jointly optimize both the unlearning and utility preservation objectives. However, existing multi-objective methods only guarantee finding a Pareto-optimal solution without fine-grained control, which causes under-optimization of the unlearning objective. To this end, we first model MU as a constrained optimization problem, that is, optimizing the unlearning objective under the constraint of a bounded increase for utility loss. We then show that solving this optimization problem is equivalent to unilateral gradient surgery on the unlearning objective. To resolve the additional computational cost brought by gradient surgery, we propose an implicit gradient surgery method, which approximates the solution to the aforementioned constrained optimization problem via only one backpropagation, thereby achieving efficient utility-preserving MU. Theoretically, we provide a tight convergence analysis of the algorithm. Empirically, our extensive experiments show that the proposed algorithm achieves better tradeoff results than existing baselines. Codes are available at https://github.com/anseryuer/EUPMU-Efficient-Utility-Preserving-Machine-Unlearning.


When UAV Swarm Meets IRS: Collaborative Secure Communications in Low-altitude Wireless Networks

arXiv.org Artificial Intelligence

Abstract--Low-altitude wireless networks (LA WNs) represent a promising architecture that integrates unmanned aerial vehicles (UA Vs) as aerial nodes to provide enhanced coverage, reliability, and throughput for diverse applications. However, these networks face significant security vulnerabilities from both known and potential unknown eavesdroppers, which may threaten data confidentiality and system integrity. T o solve this critical issue, we propose a novel secure communication framework for LA WNs where the selected UA Vs within a swarm function as a virtual antenna array (V AA), complemented by intelligent reflecting surface (IRS) to create a robust defense against eavesdropping attacks. Specifically, we formulate a multi-objective optimization problem that simultaneously maximizes the secrecy rate while minimizing the maximum sidelobe level and total energy consumption, requiring joint optimization of UA V excitation current weights, flight trajectories, and IRS phase shifts. This problem presents significant difficulties due to the dynamic nature of the system and heterogeneous components. Thus, we first transform the problem into a heterogeneous Markov decision process (MDP). Then, we propose a heterogeneous multi-agent control approach (HMCA) that integrates a dedicated IRS control policy with a multi-agent soft actor-critic framework for UA V control, which enables coordinated operation across heterogeneous network elements. Simulation results show that the proposed HMCA achieves superior performance compared to baseline approaches in terms of secrecy rate improvement, sidelobe suppression, and energy efficiency. Furthermore, we find that the collaborative and passive beamforming synergy between V AA and IRS creates robust security guarantees when the number of UA Vs increases. Jiahui Li, Xinyue Liang, and Hui Kang are with the College of Computer Science and Technology, Jilin University, Changchun 130012, China (E-mails: lijiahui@jlu.edu.cn; Geng Sun is with the College of Computer Science and Technology, Jilin University, Changchun 130012, China, and also with the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China. He is also with the College of Computing and Data Science, Nanyang Technological University, Singapore 639798 (E-mail: sungeng@jlu.edu.cn).


STAR-RIS-assisted Collaborative Beamforming for Low-altitude Wireless Networks

arXiv.org Artificial Intelligence

Abstract--While low-altitude wireless networks (LA WNs) based on uncrewed aerial vehicles (UA Vs) offer high mobility, flexibility, and coverage for urban communications, they face severe signal attenuation in dense environments due to obstructions. T o address this critical issue, we consider introducing collaborative beamforming (CB) of UA Vs and omnidirectional reconfigurable beamforming (ORB) of simultaneous transmitting and reflecting reconfigurable intelligent surfaces (ST AR-RIS) to enhance the signal quality and directionality. On this basis, we formulate a joint rate and energy optimization problem (JREOP) to maximize the transmission rate of the overall system, while minimizing the energy consumption of the UA V swarm. Due to the non-convex and NP-hard nature of JREOP, we propose a heterogeneous multi-agent collaborative dynamic (HMCD) optimization framework, which has two core components. The first component is a simulated annealing (SA)-based ST AR-RIS control method, which dynamically optimizes reflection and transmission coefficients to enhance signal propagation. The second component is an improved multi-agent deep reinforcement learning (MADRL) control method, which incorporates a self-attention evaluation mechanism to capture interactions between UA Vs and an adaptive velocity transition mechanism to enhance training stability. Simulation results demonstrate that HMCD outperforms various baselines in terms of convergence speed, average transmission rate, and energy consumption. Further analysis reveals that the average transmission rate of the overall system scales positively with both UA V count and ST AR-RIS element numbers. Index T erms--UA V, ST AR-RIS, secure communications, collaborative beamforming, multi-agent deep reinforcement learning. Xinyue Liang, Hui Kang, Junwei Che, and Jiahui Li are with the College of Computer Science and Technology, Jilin University, Changchun 130012, China (e-mails: xyliang25@mails.jlu.edu.cn; Geng Sun is with the College of Computer Science and Technology, Jilin University, Changchun 130012, China, and with Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China; he is also affiliated with the College of Computing and Data Science, Nanyang Technological University, Singapore 639798 (e-mail: sungeng@jlu.edu.cn).


FlowOpt: Fast Optimization Through Whole Flow Processes for Training-Free Editing

arXiv.org Artificial Intelligence

The remarkable success of diffusion and flow-matching models has ignited a surge of works on adapting them at test time for controlled generation tasks. Examples range from image editing to restoration, compression and personalization. However, due to the iterative nature of the sampling process in those models, it is computationally impractical to use gradient-based optimization to directly control the image generated at the end of the process. As a result, existing methods typically resort to manipulating each timestep separately. Here we introduce FlowOpt - a zero-order (gradient-free) optimization framework that treats the entire flow process as a black box, enabling optimization through the whole sampling path without backpropagation through the model. Our method is both highly efficient and allows users to monitor the intermediate optimization results and perform early stopping if desired. We prove a sufficient condition on FlowOpt's step-size, under which convergence to the global optimum is guaranteed. We further show how to empirically estimate this upper bound so as to choose an appropriate step-size. We demonstrate how FlowOpt can be used for image editing, showcasing two options: (i) inversion (determining the initial noise that generates a given image), and (ii) directly steering the edited image to be similar to the source image while conforming to a target text prompt. In both cases, FlowOpt achieves state-of-the-art results while using roughly the same number of neural function evaluations (NFEs) as existing methods. Code and examples are available on the project's webpage.


Structure-Aware Cooperative Ensemble Evolutionary Optimization on Combinatorial Problems with Multimodal Large Language Models

arXiv.org Artificial Intelligence

Evolutionary algorithms (EAs) have proven effective in exploring the vast solution spaces typical of graph-structured combinatorial problems. However, traditional encoding schemes, such as binary or numerical representations, often fail to straightforwardly capture the intricate structural properties of networks. Through employing the image-based encoding to preserve topological context, this study utilizes multimodal large language models (MLLMs) as evolutionary operators to facilitate structure-aware optimization over graph data. To address the visual clutter inherent in large-scale network visualizations, we leverage graph sparsification techniques to simplify structures while maintaining essential structural features. To further improve robustness and mitigate bias from different sparsification views, we propose a cooperative evolutionary optimization framework that facilitates cross-domain knowledge transfer and unifies multiple sparsified variants of diverse structures. Additionally, recognizing the sensitivity of MLLMs to network layout, we introduce an ensemble strategy that aggregates outputs from various layout configurations through consensus voting. Finally, experiments on real-world networks through various tasks demonstrate that our approach improves both the quality and reliability of solutions in MLLM-driven evolutionary optimization.


An AI enhanced approach to the tree unimodality conjecture

arXiv.org Artificial Intelligence

Given a graph $G$, its independence sequence is the integral sequence $a_1,a_2,...,a_n$, where $a_i$ is the number of independent sets of vertices of size i. In the late 80's Alavi, Erdos, Malde, Schwenk showed that this sequence need not be unimodal for general graphs, but conjectured that it is always unimodal whenever $G$ is a tree. This conjecture was then naturally generalized to claim that the independence sequence of trees should be log concave, in the sense that $a_i^2$ is always above $a_{i-1}a_{i+1}$. This conjecture stood for many years, until in 2023, Kadrawi, Levit, Yosef, and Mizrachi proved that there were exactly two trees on 26 vertices whose independence sequence was not log concave. In this paper, we use the AI architecture PatternBoost, developed by Charton, Ellenberg, Wagner, and Williamson to train a machine to find counter-examples to the log-concavity conjecture. We will discuss the successes of this approach - finding tens of thousands of new counter-examples to log-concavity with vertex set sizes varying from 27 to 101 - and some of its fascinating failures.