Goto

Collaborating Authors

 Evolutionary Systems


CORE: Collaborative Optimization with Reinforcement Learning and Evolutionary Algorithm for Floorplanning

Neural Information Processing Systems

Floorplanning is the initial step in the physical design process of Electronic Design Automation (EDA), directly influencing subsequent placement, routing, and final power of the chip. However, the solution space in floorplanning is vast, and current algorithms often struggle to explore it sufficiently, making them prone to getting trapped in local optima. To achieve efficient floorplanning, we propose CORE, a general and effective solution optimization framework that synergizes Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) for high-quality layout search and optimization. Specifically, we propose the Clustering-based Diversified Evolutionary Search that directly perturbs layouts and evolves them based on novelty and performance. Additionally, we model the floorplanning problem as a sequential decision problem with B*-Tree representation and employ RL for efficient learning.


PROSPERO: Active Learning for Robust Protein Design Beyond Wild-Type Neighborhoods

Neural Information Processing Systems

Designing protein sequences of both high fitness and novelty is a challenging task in data-efficient protein engineering. Exploration beyond wild-type neighborhoods often leads to biologically implausible sequences or relies on surrogate models that lose fidelity in novel regions. Here, we propose PROSPERO, an active learning framework in which a frozen pre-trained generative model is guided by a surrogate updated from oracle feedback. By integrating fitness-relevant residue selection with biologically-constrained Sequential Monte Carlo sampling, our approach enables exploration beyond wild-type neighborhoods while preserving biological plausibility. We show that our framework remains effective even when the surrogate is misspecified. PROSPERO consistently outperforms or matches existing methods across diverse protein engineering tasks, retrieving sequences of both high fitness and novelty.


Theoretical Guarantees for the Retention of Strict Nash Equilibria by Coevolutionary Algorithms

Neural Information Processing Systems

Most methods for finding a Nash equilibrium rely on procedures that operate over the entire action space, making them infeasible for settings with too many actions to be searched exhaustively. Randomised search heuristics such as coevolutionary algorithms offer benefits in such settings, however they lack many of the theoretical guarantees established for exhaustive methods such as zero-regret learning. We address this by developing a method for proving necessary and sufficient conditions for a coevolutionary algorithm to be stable, in the sense that it reliably retains a Nash equilibrium following discovery. As the method provides bounds that are adapted to both application and algorithm instance, it can be used as a practical tool for parameter configuration. We additionally show how bounds on regret may be deduced from our results and undertake corresponding empirical analysis.


Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble

Neural Information Processing Systems

Many real-world black-box optimization problems have multiple conflicting objectives. Rather than attempting to approximate the entire set of Pareto-optimal solutions, interactive preference learning, i.e., optimization with a decision maker in the loop, allows us to focus the search on the most relevant subset. However, few previous studies have exploited the fact that utility functions are typically monotonic. In this paper, we address the Bayesian Optimization with Preference Exploration (BOPE) problem and propose using a neural network ensemble as a utility surrogate model. This approach naturally integrates monotonicity and allows learning the decision maker's preferences from pairwise comparisons. Our experiments demonstrate that the proposed method outperforms state-of-the-art approaches and is robust to noise in utility evaluations. An ablation study highlights the critical role of monotonicity in enhancing performance.


Why Popular MOEAs Are Popular: Proven Advantages in Approximating the Pareto Front

Neural Information Processing Systems

Recent breakthroughs in the analysis of multi-objective evolutionary algorithms (MOEAs) are mathematical runtime analyses of those algorithms which are intensively used in practice. So far, most of these results show the same performance as previously known for simpler algorithms like the GSEMO. The few results indicating advantages of the popular MOEAs share the same shortages: They only consider the problem of computing the full Pareto front, sometimes of algorithms enriched with newly invented mechanisms, and this on newly designed benchmarks. In this work, we overcome these shortcomings by analyzing how existing popular MOEAs approximate the Pareto front of the established LARGEFRONT benchmark. We prove that several popular MOEAs, including NSGA-II (with current crowding distance), NSGA-III, SMS-EMOA, and SPEA2, only need an expected time of O(n2 logn) fitness evaluations to compute an additive ฯ‰-approximation of the Pareto front of the LARGEFRONT benchmark. This contrasts with the already proven exponential runtime (with high probability) of the GSEMO on the same task. Our result is the first mathematical runtime analysis showing and explaining the superiority of popular MOEAs over simple ones like the GSEMO for the central task of computing good approximations to the Pareto front.


HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models

Neural Information Processing Systems

Model merging is a technique that combines multiple large pretrained models into a single model, enhancing performance and broadening task adaptability without original data or additional training. However, most existing model merging methods focus primarily on exploring the parameter space, merging models with identical architectures. Despite its potential, merging in the architecture space remains in its early stages due to the vast search space and challenges related to layer compatibility. This paper designs a hierarchical model merging framework named HM3, formulating a bilevel multi-objective model merging problem across both parameter and architecture spaces. At the parameter level, HM3 integrates existing merging methods to quickly identify optimal parameters. Based on these, an actorcritic strategy with efficient policy discretization is employed at the architecture level to explore inference paths with Markov property in the layer-granularity search space for reconstructing these optimal models. By training reusable policy and value networks, HM3 learns Pareto optimal models to provide customized solutions for various tasks. Experimental results on language and vision tasks demonstrate that HM3 outperforms methods focusing solely on the parameter or architecture space.


IMPACT: Irregular Multi-Patch Adversarial Composition Based on Two-Phase Optimization

Neural Information Processing Systems

Deep neural networks have become foundational in various applications but remain vulnerable to adversarial patch attacks. Crafting effective adversarial patches is inherently challenging due to the combinatorial complexity involved in jointly optimizing critical factors such as patch shape, location, number, and content. Existing approaches often simplify this optimization by addressing each factor independently, which limits their effectiveness. To tackle this significant challenge, we introduce a novel and flexible adversarial attack framework termed IMPACT (Irregular Multi-Patch Adversarial Composition based on Two-phase optimization). IMPACT uniquely enables comprehensive optimization of all essential patch factors using gradient-free methods. Specifically, we propose a novel dimensionality reduction encoding scheme that substantially lowers computational complexity while preserving expressive power. Leveraging this encoding, we further develop a two-phase optimization framework: phase 1 employs differential evolution for joint optimization of patch mask and content, while phase 2 refines patch content using an evolutionary strategy for enhanced precision. Additionally, we introduce a new aggregation algorithm explicitly designed to produce contiguous, irregular patches by merging localized regions, ensuring physical applicability. Extensive experiments demonstrate that our method significantly outperforms several state-of-the-art approaches, highlighting the critical benefit of jointly optimizing all patch factors in adversarial patch attacks.


Why Playing Against Diverse and Challenging Opponents Speeds Up Coevolution: ATheoretical Analysis on Combinatorial Games

Neural Information Processing Systems

Competitive coevolutionary algorithms (CoEAs) have a natural application to problems that are adversarial or feature strategic interaction. However, there is currently limited theoretical insight into how to avoid pathological behaviour associated with CoEAs. In this paper we use impartial combinatorial games as a challenging domain for CoEAs and provide a corresponding runtime analysis. By analysing how individuals capitalise on the mistakes of their opponents, we prove that the Univariate Marginal Distribution Algorithm finds (with high probability) an optimal strategy for a game called Reciprocal LeadingOnes within O(n2 log3 n)game evaluations, a significant improvement over the best known bound of O(n5 log2 n). Critical to the analysis is the introduction of a novel stabilising operator, the impact of which we study both theoretically and empirically.


Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning

Neural Information Processing Systems

Neural Combinatorial Optimization (NCO) has emerged as a promising learningbased paradigm for addressing Vehicle Routing Problems (VRPs) by minimizing the need for extensive manual engineering. While existing NCO methods, trained on small-scale instances (e.g., 100 nodes), have demonstrated considerable success on problems of similar scale, their performance significantly degrades when applied to large-scale scenarios. This degradation arises from the distributional shift between training and testing data, rendering policies learned on small instances ineffective for larger problems. To overcome this limitation, we introduce a novel learning framework driven by Large Language Models (LLMs). This framework learns a projection between the training and testing distributions, which is then deployed to enhance the scalability of the NCO model. Notably, unlike prevailing techniques that necessitate joint training with the neural network, our approach operates exclusively during the inference phase, obviating the need for model retraining. Extensive experiments demonstrate that our method enables a backbone model (trained on 100-node instances) to achieve superior performance on large-scale Traveling Salesman Problems (TSPs) and Capacitated Vehicle Routing Problems (CVRPs) with up to 100K nodes from diverse distributions.


OriginalImageMaskFold 1Fold 2Fold 3Fold 4Fold 5IdealSplitRandomSplit

Neural Information Processing Systems

Random splitting of datasets in image segmentation often leads to unrepresentative test sets, resulting in biased evaluations and poor model generalization. While stratified sampling has proven effective for addressing label distribution imbalance in classification tasks, extending these ideas to segmentation remains challenging due to the multi-label structure and class imbalance typically present in such data. Building on existing stratification concepts, we introduce Iterative Pixel Stratification (IPS), a straightforward, label-aware sampling method tailored for segmentation tasks. Additionally, we present Wasserstein-Driven Evolutionary Stratification (WDES), a novel genetic algorithm designed to minimize the Wasserstein distance, thereby optimizing the similarity of label distributions across dataset splits. We prove that WDES is globally optimal given enough generations. Using newly proposed statistical heterogeneity metrics, we evaluate both methods against random sampling and find that WDES consistently produces more representative splits. Applying WDES across diverse segmentation tasks, including street scenes, medical imaging, and satellite imagery, leads to lower performance variance and improved model evaluation. Our results also highlight the particular value of WDES in handling small, imbalanced, and low-diversity datasets, where conventional splitting strategies are most prone to bias.