Evolutionary Systems
DALex: Lexicase-like Selection via Diverse Aggregation
Ni, Andrew, Ding, Li, Spector, Lee
Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so significantly more quickly. The new method, called DALex (for Diversely Aggregated Lexicase), selects the best individual with respect to a weighted sum of training case errors, where the weights are randomly sampled. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its "relaxed" variants, such as epsilon or batch lexicase selection, by adjusting a single hyperparameter, named "particularity pressure," which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.
Emergence of specialized Collective Behaviors in Evolving Heterogeneous Swarms
van Diggelen, Fuda, De Carlo, Matteo, Cambier, Nicolas, Ferrante, Eliseo, Eiben, A. E.
Natural groups of animals, such as swarms of social insects, exhibit astonishing degrees of task specialization, useful to address complex tasks and to survive. This is supported by phenotypic plasticity: individuals sharing the same genotype that is expressed differently for different classes of individuals, each specializing in one task. In this work, we evolve a swarm of simulated robots with phenotypic plasticity to study the emergence of specialized collective behavior during an emergent perception task. Phenotypic plasticity is realized in the form of heterogeneity of behavior by dividing the genotype into two components, with one different neural network controller associated to each component. The whole genotype, expressing the behavior of the whole group through the two components, is subject to evolution with a single fitness function. We analyse the obtained behaviors and use the insights provided by these results to design an online regulatory mechanism. Our experiments show three main findings: 1) The sub-groups evolve distinct emergent behaviors. 2) The effectiveness of the whole swarm depends on the interaction between the two sub-groups, leading to a more robust performance than with singular sub-group behavior. 3) The online regulatory mechanism enhances overall performance and scalability.
Meta-learning the mirror map in policy mirror descent
Alfano, Carlo, Towers, Sebastian, Sapora, Silvia, Lu, Chris, Rebeschini, Patrick
Policy Mirror Descent (PMD) is a popular framework in reinforcement learning, serving as a unifying perspective that encompasses numerous algorithms. These algorithms are derived through the selection of a mirror map and enjoy finite-time convergence guarantees. Despite its popularity, the exploration of PMD's full potential is limited, with the majority of research focusing on a particular mirror map -- namely, the negative entropy -- which gives rise to the renowned Natural Policy Gradient (NPG) method. It remains uncertain from existing theoretical studies whether the choice of mirror map significantly influences PMD's efficacy. In our work, we conduct empirical investigations to show that the conventional mirror map choice (NPG) often yields less-than-optimal outcomes across several standard benchmark environments. By applying a meta-learning approach, we identify more efficient mirror maps that enhance performance, both on average and in terms of best performance achieved along the training trajectory. We analyze the characteristics of these learned mirror maps and reveal shared traits among certain settings. Our results suggest that mirror maps have the potential to be adaptable across various environments, raising questions about how to best match a mirror map to an environment's structure and characteristics.
A Bandit Approach with Evolutionary Operators for Model Selection
Brégère, Margaux, Keisler, Julie
This paper formulates model selection as an infinite-armed bandit problem. The models are arms, and picking an arm corresponds to a partial training of the model (resource allocation). The reward is the accuracy of the selected model after its partial training. In this best arm identification problem, regret is the gap between the expected accuracy of the optimal model and that of the model finally chosen. We first consider a straightforward generalization of UCB-E to the stochastic infinite-armed bandit problem and show that, under basic assumptions, the expected regret order is $T^{-\alpha}$ for some $\alpha \in (0,1/5)$ and $T$ the number of resources to allocate. From this vanilla algorithm, we introduce the algorithm Mutant-UCB that incorporates operators from evolutionary algorithms. Tests carried out on three open source image classification data sets attest to the relevance of this novel combining approach, which outperforms the state-of-the-art for a fixed budget.
Human-guided Swarms: Impedance Control-inspired Influence in Virtual Reality Environments
Barclay, Spencer, Jerath, Kshitij
As the potential for societal integration of multi-agent robotic systems increases [1], the need to manage the collective behaviors of such systems also increases [2, 3, 4]. There has been significant research effort directed towards the examination of how humans can assist in controlling such collective behaviors, such as in human-swarm interactions [5, 6, 7]. Agent-agent interactions in a swarm of small unmanned aerial systems (sUAS) lead to the emergence of collective behaviors that enable effective coverage and exploration across large spatial extents. However, the same inherent collective behaviors can occasionally limit the ability of the sUAS swarm to focus on specific objects of interest during coverage or exploration missions [8]. In these scenarios, the human operator or supervisor should have the opportunity to fractionally revoke or limit emergent swarm behaviors, and guide the swarm to achieve mission objectives. For most applications, including in industry-and defense-related contexts, such human-swarm interaction (HSI) will likely require intuitive and predictable mechanisms of control to quickly translate the input of the human (such as a gesture) to an influence or effect on the sUAS swarm. The goal of our work is to create an intuitive interface for a human supervisor to influence or guide an sUAS swarm without excessive incursions on decentralized control afforded by these systems, while attempting to create more predictable behaviors. This is a potentially valuable approach that can enable the fully utilization of swarm capabilities, while also retaining an ongoing macroscopic-level of swarm control in scenarios where focus on specific regions of interest is required (e.g., search and rescue, surveillance operations) [9]. The influence mechanism has been implemented and tested using 16 drones in a photo-realistic virtual reality (VR) environment (as shown in Figure 1).
Symbol: Generating Flexible Black-Box Optimizers through Symbolic Equation Learning
Chen, Jiacheng, Ma, Zeyuan, Guo, Hongshu, Ma, Yining, Zhang, Jie, Gong, Yue-Jiao
Recent Meta-learning for Black-Box Optimization (MetaBBO) methods harness neural networks to meta-learn configurations of traditional black-box optimizers. Despite their success, they are inevitably restricted by the limitations of predefined hand-crafted optimizers. In this paper, we present \textsc{Symbol}, a novel framework that promotes the automated discovery of black-box optimizers through symbolic equation learning. Specifically, we propose a Symbolic Equation Generator (SEG) that allows closed-form optimization rules to be dynamically generated for specific tasks and optimization steps. Within \textsc{Symbol}, we then develop three distinct strategies based on reinforcement learning, so as to meta-learn the SEG efficiently. Extensive experiments reveal that the optimizers generated by \textsc{Symbol} not only surpass the state-of-the-art BBO and MetaBBO baselines, but also exhibit exceptional zero-shot generalization abilities across entirely unseen tasks with different problem dimensions, population sizes, and optimization horizons. Furthermore, we conduct in-depth analyses of our \textsc{Symbol} framework and the optimization rules that it generates, underscoring its desirable flexibility and interpretability.
NK Hybrid Genetic Algorithm for Clustering
Tinós, Renato, Zhao, Liang, Chicano, Francisco, Whitley, Darrell
The NK hybrid genetic algorithm for clustering is proposed in this paper. In order to evaluate the solutions, the hybrid algorithm uses the NK clustering validation criterion 2 (NKCV2). NKCV2 uses information about the disposition of $N$ small groups of objects. Each group is composed of $K+1$ objects of the dataset. Experimental results show that density-based regions can be identified by using NKCV2 with fixed small $K$. In NKCV2, the relationship between decision variables is known, which in turn allows us to apply gray box optimization. Mutation operators, a partition crossover, and a local search strategy are proposed, all using information about the relationship between decision variables. In partition crossover, the evaluation function is decomposed into $q$ independent components; partition crossover then deterministically returns the best among $2^q$ possible offspring with computational complexity $O(N)$. The NK hybrid genetic algorithm allows the detection of clusters with arbitrary shapes and the automatic estimation of the number of clusters. In the experiments, the NK hybrid genetic algorithm produced very good results when compared to another genetic algorithm approach and to state-of-art clustering algorithms.
Dual-stage optimizer for systematic overestimation adjustment applied to multi-objective genetic algorithms for biomarker selection
Cattelani, Luca, Fortino, Vittorio
The challenge in biomarker discovery using machine learning from omics data lies in the abundance of molecular features but scarcity of samples. Most feature selection methods in machine learning require evaluating various sets of features (models) to determine the most effective combination. This process, typically conducted using a validation dataset, involves testing different feature sets to optimize the model's performance. Evaluations have performance estimation error and when the selection involves many models the best ones are almost certainly overestimated. Biomarker identification with feature selection methods can be addressed as a multi-objective problem with trade-offs between predictive ability and parsimony in the number of features. Genetic algorithms are a popular tool for multi-objective optimization but they evolve numerous solutions thus are prone to overestimation. Methods have been proposed to reduce the overestimation after a model has already been selected in single-objective problems, but no algorithm existed capable of reducing the overestimation during the optimization, improving model selection, or applied in the more general multi-objective domain. We propose DOSA-MO, a novel multi-objective optimization wrapper algorithm that learns how the original estimation, its variance, and the feature set size of the solutions predict the overestimation. DOSA-MO adjusts the expectation of the performance during the optimization, improving the composition of the solution set. We verify that DOSA-MO improves the performance of a state-of-the-art genetic algorithm on left-out or external sample sets, when predicting cancer subtypes and/or patient overall survival, using three transcriptomics datasets for kidney and breast cancer.
Exponential Auto-Tuning Fault-Tolerant Control of N Degrees-of-Freedom Manipulators Subject to Torque Constraints
Shahna, Mehdi Heydari, Mattila, Jouni
This paper presents a novel auto-tuning subsystem-based fault-tolerant control (SBFC) system designed for robotic manipulator systems with n degrees of freedom (DoF). It initially proposes a novel model for joint torques, incorporating an actuator fault correction model to account for potential faults and a mathematical saturation function to mitigate issues related to unforeseen excessive torque. This model is designed to prevent the generation of excessive torques even by faulty actuators. Subsequently, a robust subsystem-based adaptive control strategy is proposed to force system states closely along desired trajectories, while tolerating various actuator faults, excessive torques, and unknown modeling errors. Furthermore, optimal SBFC gains are determined by tailoring the JAYA algorithm (JA), a high-performance swarm intelligence technique, standing out for its capacity to optimize without the need for meticulous tuning of algorithm-specific parameters, relying instead on its intrinsic principles. Notably, this control framework ensures uniform exponential stability (UES). The enhancement of accuracy and tracking time for reference trajectories, along with the validation of theoretical assertions, is demonstrated through the presentation of simulation outcomes.
Embedding Hardware Approximations in Discrete Genetic-based Training for Printed MLPs
Afentaki, Florentia, Hefenbrock, Michael, Zervakis, Georgios, Tahoori, Mehdi B.
Printed Electronics (PE) stands out as a promisingtechnology for widespread computing due to its distinct attributes, such as low costs and flexible manufacturing. Unlike traditional silicon-based technologies, PE enables stretchable, conformal,and non-toxic hardware. However, PE are constrained by larger feature sizes, making it challenging to implement complex circuits such as machine learning (ML) classifiers. Approximate computing has been proven to reduce the hardware cost of ML circuits such as Multilayer Perceptrons (MLPs). In this paper, we maximize the benefits of approximate computing by integrating hardware approximation into the MLP training process. Due to the discrete nature of hardware approximation, we propose and implement a genetic-based, approximate, hardware-aware training approach specifically designed for printed MLPs. For a 5% accuracy loss, our MLPs achieve over 5x area and power reduction compared to the baseline while outperforming state of-the-art approximate and stochastic printed MLPs.