Goto

Collaborating Authors

 Hart, Emma


A Paired Autoencoder Framework for Inverse Problems via Bayes Risk Minimization

arXiv.org Artificial Intelligence

In this work, we describe a new data-driven approach for inverse problems that exploits technologies from machine learning, in particular autoencoder network structures. We consider a paired autoencoder framework, where two autoencoders are used to efficiently represent the input and target spaces separately and optimal mappings are learned between latent spaces, thus enabling forward and inverse surrogate mappings. We focus on interpretations using Bayes risk and empirical Bayes risk minimization, and we provide various theoretical results and connections to existing works on low-rank matrix approximations. Similar to end-to-end approaches, our paired approach creates a surrogate model for forward propagation and regularized inversion. However, our approach outperforms existing approaches in scenarios where training data for unsupervised learning are readily available but training pairs for supervised learning are scarce. Furthermore, we show that cheaply computable evaluation metrics are available through this framework and can be used to predict whether the solution for a new sample should be predicted well.


Algorithm Selection with Probing Trajectories: Benchmarking the Choice of Classifier Model

arXiv.org Artificial Intelligence

Recent approaches to training algorithm selectors in the black-box optimisation domain have advocated for the use of training data that is'algorithm-centric' in order to encapsulate information about how an algorithm performs on an instance, rather than relying on information derived from features of the instance itself. Probing trajectories that consist of a sequence of objective performance per function evaluation obtained from a short run of an algorithm have recently shown particular promise in training accurate selectors. However, training models on this type of data requires an appropriately chosen classifier given the sequential nature of the data. There are currently no clear guidelines for choosing the most appropriate classifier for algorithm selection using time-series data from the plethora of models available. To address this, we conduct a large benchmark study using 17 different classifiers and three types of trajectory on a classification task using the BBOB benchmark suite using both leave-one-instance out and leave-one-problem out cross-validation. In contrast to previous studies using tabular data, we find that the choice of classifier has a significant impact, showing that feature-based and interval-based models are the best choices.


Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting

arXiv.org Machine Learning

Flow matching has recently emerged as a powerful paradigm for generative modeling and has been extended to probabilistic time series forecasting in latent spaces. However, the impact of the specific choice of probability path model on forecasting performance remains under-explored. In this work, we demonstrate that forecasting spatio-temporal data with flow matching is highly sensitive to the selection of the probability path model. Motivated by this insight, we propose a novel probability path model designed to improve forecasting performance. Our empirical results across various dynamical system benchmarks show that our model achieves faster convergence during training and improved predictive performance compared to existing probability path models. Importantly, our approach is efficient during inference, requiring only a few sampling steps. This makes our proposed model practical for real-world applications and opens new avenues for probabilistic forecasting.


Evaluating the Robustness of Deep-Learning Algorithm-Selection Models by Evolving Adversarial Instances

arXiv.org Artificial Intelligence

Deep neural networks (DNN) are increasingly being used to perform algorithm-selection in combinatorial optimisation domains, particularly as they accommodate input representations which avoid designing and calculating features. Mounting evidence from domains that use images as input shows that deep convolutional networks are vulnerable to adversarial samples, in which a small perturbation of an instance can cause the DNN to misclassify. However, it remains unknown as to whether deep recurrent networks (DRN) which have recently been shown promise as algorithm-selectors in the bin-packing domain are equally vulnerable. We use an evolutionary algorithm (EA) to find perturbations of instances from two existing benchmarks for online bin packing that cause trained DRNs to misclassify: adversarial samples are successfully generated from up to 56% of the original instances depending on the dataset. Analysis of the new misclassified instances sheds light on the `fragility' of some training instances, i.e. instances where it is trivial to find a small perturbation that results in a misclassification and the factors that influence this. Finally, the method generates a large number of new instances misclassified with a wide variation in confidence, providing a rich new source of training data to create more robust models.


Paired Autoencoders for Inverse Problems

arXiv.org Artificial Intelligence

We consider the solution of nonlinear inverse problems where the forward problem is a discretization of a partial differential equation. Such problems are notoriously difficult to solve in practice and require minimizing a combination of a data-fit term and a regularization term. The main computational bottleneck of typical algorithms is the direct estimation of the data misfit. Therefore, likelihood-free approaches have become appealing alternatives. Nonetheless, difficulties in generalization and limitations in accuracy have hindered their broader utility and applicability. In this work, we use a paired autoencoder framework as a likelihood-free estimator for inverse problems. We show that the use of such an architecture allows us to construct a solution efficiently and to overcome some known open problems when using likelihood-free estimators. In particular, our framework can assess the quality of the solution and improve on it if needed. We demonstrate the viability of our approach using examples from full waveform inversion and inverse electromagnetic imaging.


Improving Algorithm-Selection and Performance-Prediction via Learning Discriminating Training Samples

arXiv.org Artificial Intelligence

The choice of input-data used to train algorithm-selection models is recognised as being a critical part of the model success. Recently, feature-free methods for algorithm-selection that use short trajectories obtained from running a solver as input have shown promise. However, it is unclear to what extent these trajectories reliably discriminate between solvers. We propose a meta approach to generating discriminatory trajectories with respect to a portfolio of solvers. The algorithm-configuration tool irace is used to tune the parameters of a simple Simulated Annealing algorithm (SA) to produce trajectories that maximise the performance metrics of ML models trained on this data. We show that when the trajectories obtained from the tuned SA algorithm are used in ML models for algorithm-selection and performance prediction, we obtain significantly improved performance metrics compared to models trained both on raw trajectory data and on exploratory landscape features.


An Investigation of the Factors Influencing Evolutionary Dynamics in the Joint Evolution of Robot Body and Control

arXiv.org Artificial Intelligence

In evolutionary robotics, jointly optimising the design and the controller of robots is a challenging task due to the huge complexity of the solution space formed by the possible combinations of body and controller. We focus on the evolution of robots that can be physically created rather than just simulated, in a rich morphological space that includes a voxel-based chassis, wheels, legs and sensors. On the one hand, this space offers a high degree of liberty in the range of robots that can be produced, while on the other hand introduces a complexity rarely dealt with in previous works relating to matching controllers to designs and in evolving closed-loop control. This is usually addressed by augmenting evolution with a learning algorithm to refine controllers. Although several frameworks exist, few have studied the role of the \textit{evolutionary dynamics} of the intertwined `evolution+learning' processes in realising high-performing robots. We conduct an in-depth study of the factors that influence these dynamics, specifically: synchronous vs asynchronous evolution; the mechanism for replacing parents with offspring, and rewarding goal-based fitness vs novelty via selection. Results show that asynchronicity combined with goal-based selection and a `replace worst' strategy results in the highest performance.


Understanding fitness landscapes in morpho-evolution via local optima networks

arXiv.org Artificial Intelligence

Morpho-evolution (ME) refers to the simultaneous optimisation of a robot's design and controller to maximise performance given a task and environment. Many genetic encodings have been proposed which are capable of representing design and control. Previous research has provided empirical comparisons between encodings in terms of their performance with respect to an objective function and the diversity of designs that are evaluated, however there has been no attempt to explain the observed findings. We address this by applying Local Optima Network (LON) analysis to investigate the structure of the fitness landscapes induced by three different encodings when evolving a robot for a locomotion task, shedding new light on the ease by which different fitness landscapes can be traversed by a search process. This is the first time LON analysis has been applied in the field of ME despite its popularity in combinatorial optimisation domains; the findings will facilitate design of new algorithms or operators that are customised to ME landscapes in the future.


On the Utility of Probing Trajectories for Algorithm-Selection

arXiv.org Artificial Intelligence

Machine-learning approaches to algorithm-selection typically take data describing an instance as input. Input data can take the form of features derived from the instance description or fitness landscape, or can be a direct representation of the instance itself, i.e. an image or textual description. Regardless of the choice of input, there is an implicit assumption that instances that are similar will elicit similar performance from algorithm, and that a model is capable of learning this relationship. We argue that viewing algorithm-selection purely from an instance perspective can be misleading as it fails to account for how an algorithm `views' similarity between instances. We propose a novel `algorithm-centric' method for describing instances that can be used to train models for algorithm-selection: specifically, we use short probing trajectories calculated by applying a solver to an instance for a very short period of time. The approach is demonstrated to be promising, providing comparable or better results to computationally expensive landscape-based feature-based approaches. Furthermore, projecting the trajectories into a 2-dimensional space illustrates that functions that are similar from an algorithm-perspective do not necessarily correspond to the accepted categorisation of these functions from a human perspective.


Generalized Early Stopping in Evolutionary Direct Policy Search

arXiv.org Artificial Intelligence

Evolutionary algorithms (EAs) are increasingly being using in applications such as computer games [De Souza, 2014, Hastings et al., 2009] and robotics [Hoffmann, 2001, Fleming and Purshouse, 2002] to learn control algorithms (policies), as well as being applied to classic control tasks such as the benchmark suites available in OpenAi Gym [Brockman et al., 2016]. Often direct policy search algorithms such as EAs require a large number of evaluations: when these evaluations are costly in terms of time, this can result in extremely long learning times, which can be prohibitive in the worst case. Unfortunately many applications of interest suffer from this problem. For example, the protein folding problem [Dill et al., 2008] requires costly simulations, while applications that involve a double optimization process are also considered very computationally costly. This includes for example the joint optimization of robot morphology and control [Hart and Le Goff, 2022, Le Goff et al., 2021] in simulation (which typically use an outer loop to evolve body-plans and a nested inner-loop to evolve control), nested combinatorial optimization problems[Wu et al., 2021, Kobeaga et al., 2021] or hyperparameter optimization [De Souza et al., 2022]. Specifically in robotics, evaluations that need to be conducted directly on a physical robot to avoid any reality-gap tend to be very time-consuming, while repeating lengthy evaluations also places considerable wear and tear on machinery, potentially leading to unreliable objective-function values.