Chan, Jeffrey
Learning Enhanced Optimisation for Routing Problems
Sultana, Nasrin, Chan, Jeffrey, Sarwar, Tabinda, Abbasi, Babak, Qin, A. K.
Deep learning approaches have shown promising results in solving routing problems. However, there is still a substantial gap in solution quality between machine learning and operations research algorithms. Recently, another line of research has been introduced that fuses the strengths of machine learning and operations research algorithms. In particular, search perturbation operators have been used to improve the solution. Nevertheless, using perturbation may not guarantee a quality solution. This paper presents "Learning to Guide Local Search" (L2GLS), a learning-based approach for routing problems that uses a penalty term and reinforcement learning to adaptively adjust search efforts. L2GLS combines the strengths of local search (LS) operators with penalty terms to escape local optima. Routing problems have many practical applications, often presenting larger instances that are still challenging for many existing algorithms introduced in the learning to optimise field. We show that L2GLS achieves new state-of-the-art results on larger TSP and CVRP instances compared with other machine learning methods.
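To make the idea of penalty terms that help a local search escape local optima concrete, the following is a minimal sketch of classic (non-learned) guided local search for the TSP with a 2-opt operator. It is not the L2GLS method: there is no reinforcement learning component, and the function names, the `alpha` parameter, and the rule used to set `lam` are illustrative assumptions.

```python
import math

def tour_length(tour, dist):
    """Length of the closed tour under the true distances."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))

def two_opt(tour, cost):
    """Apply improving 2-opt moves until none remains under `cost`."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a node
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                delta = cost(a, c) + cost(b, d) - cost(a, b) - cost(c, d)
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

def guided_local_search(points, rounds=50, alpha=0.3):
    """Classic guided local search sketch (assumed parameters, no learning)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    penalty = [[0] * n for _ in range(n)]
    lam = 0.0

    def augmented(i, j):
        # True edge cost plus a weighted penalty on edges that were
        # repeatedly found in local optima.
        return dist[i][j] + lam * penalty[i][j]

    tour = two_opt(list(range(n)), augmented)  # lam == 0: plain 2-opt
    best, best_len = tour[:], tour_length(tour, dist)
    lam = alpha * best_len / n  # a common heuristic, assumed here

    for _ in range(rounds):
        edges = [(tour[k], tour[(k + 1) % n]) for k in range(n)]
        utils = [dist[i][j] / (1 + penalty[i][j]) for i, j in edges]
        max_util = max(utils)
        # Penalise the edges of the current local optimum with highest utility.
        for (i, j), u in zip(edges, utils):
            if u == max_util:
                penalty[i][j] += 1
                penalty[j][i] += 1
        tour = two_opt(tour, augmented)
        length = tour_length(tour, dist)  # always judged by the true cost
        if length < best_len:
            best, best_len = tour[:], length
    return best, best_len
```

The search moves on the augmented cost, but the best tour is always selected by its true length, so penalties only change where the search looks, not how solutions are scored.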
Parallelizing Contextual Linear Bandits
Chan, Jeffrey, Pacchiano, Aldo, Tripuraneni, Nilesh, Song, Yun S., Bartlett, Peter, Jordan, Michael I.
Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions. However, simultaneously proposing a batch of decisions, which leverages available resources for parallel experimentation, has the potential to rapidly accelerate exploration. We present a family of (parallel) contextual linear bandit algorithms, whose regret is nearly identical to their perfectly sequential counterparts -- given access to the same total number of oracle queries -- up to a lower-order "burn-in" term that is dependent on the context-set geometry. We provide matching information-theoretic lower bounds on parallel regret performance to establish our algorithms are asymptotically optimal in the time horizon. Finally, we also present an empirical evaluation of these parallel algorithms in several domains, including materials discovery and biological sequence design problems, to demonstrate the utility of parallelized bandits in practical settings.
Divide and Learn: A Divide and Conquer Approach for Predict+Optimize
Guler, Ali Ugur, Demirovic, Emir, Chan, Jeffrey, Bailey, James, Leckie, Christopher, Stuckey, Peter J.
Affiliations: Ali Ugur Guler, James Bailey and Christopher Leckie (University of Melbourne); Emir Demirovic (Delft University of Technology); Jeffrey Chan (RMIT University); Peter J. Stuckey (Monash University). Contact: aguler@student.unimelb.edu.au
Abstract: The predict+optimize problem combines machine learning of problem coefficients with a combinatorial optimization problem that uses the predicted coefficients. While this problem can be solved in two separate stages, it is better to directly minimize the optimization loss. However, this requires differentiating through a discrete, non-differentiable combinatorial function. Most existing approaches use some form of surrogate gradient. Demirovic et al. showed how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piece-wise linear function. However, their approach is restricted to optimization problems with a dynamic programming formulation. In this work we propose a novel divide and conquer algorithm to tackle optimization problems without this restriction and predict its coefficients using the optimization loss. We also introduce a greedy version of this approach, which achieves similar results with less computation. We compare our approach with other approaches to the predict+optimize problem and show we can successfully tackle some hard combinatorial problems better than other predict+optimize methods.
Introduction: Machine learning (ML) has gained substantial attention in the last decade, and has proven to be useful in a wide range of industries. ML models usually focus on making accurate predictions by minimizing errors, such as mean squared error (MSE). These predictions can then be used as coefficients in other decision making processes, such as a combinatorial optimization problem.
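The difference between prediction error and optimization loss can be illustrated with a small sketch. It assumes a toy 0/1 knapsack solved by brute force and defines regret as the true value lost by optimizing over predicted rather than true coefficients; the function names and the numbers are hypothetical, and this is not the divide and conquer algorithm of the paper.

```python
from itertools import combinations

def solve_knapsack(values, weights, capacity):
    """Brute-force 0/1 knapsack: return the best feasible tuple of item indices."""
    n = len(values)
    best, best_val = (), 0.0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if sum(weights[i] for i in subset) <= capacity:
                val = sum(values[i] for i in subset)
                if val > best_val:
                    best, best_val = subset, val
    return best

def mse(true_values, predicted_values):
    """Mean squared error of the predicted coefficients."""
    n = len(true_values)
    return sum((t - p) ** 2 for t, p in zip(true_values, predicted_values)) / n

def regret(true_values, predicted_values, weights, capacity):
    """True value lost by deciding with predicted instead of true coefficients."""
    chosen = solve_knapsack(predicted_values, weights, capacity)
    optimal = solve_knapsack(true_values, weights, capacity)
    true_value = lambda subset: sum(true_values[i] for i in subset)
    return true_value(optimal) - true_value(chosen)

# Hypothetical data: prediction B has the lower MSE but the higher regret.
true_values = [10, 6, 5]
weights = [4, 3, 3]
capacity = 6
pred_a = [9, 7, 6]       # larger errors, yet the same decision as the optimum
pred_b = [10.5, 6, 4]    # smaller errors, yet a worse decision
```

With these numbers, `pred_b` is closer to the true values in the MSE sense but leads the solver to pick a different item set, which is why training against the optimization loss can be preferable to training against prediction error alone.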
Representing and Denoising Wearable ECG Recordings
Chan, Jeffrey, Miller, Andrew C., Fox, Emily B.
Modern wearable devices are embedded with a range of noninvasive biomarker sensors that hold promise for improving detection and treatment of disease. One such sensor is the single-lead electrocardiogram (ECG) which measures electrical signals in the heart. The benefits of the sheer volume of ECG measurements with rich longitudinal structure made possible by wearables come at the price of potentially noisier measurements compared to clinical ECGs, e.g., due to movement. In this work, we develop a statistical model to simulate a structured noise process in ECGs derived from a wearable sensor, design a beat-to-beat representation that is conducive for analyzing variation, and devise a factor analysis-based method to denoise the ECG. We study synthetic data generated using a realistic ECG simulator and a structured noise model. At varying levels of signal-to-noise, we quantitatively measure an upper bound on performance and compare estimates from linear and non-linear models. Finally, we apply our method to a set of ECGs collected by wearables in a mobile health study.
Learning to Optimise General TSP Instances
Sultana, Nasrin, Chan, Jeffrey, Qin, A. K., Sarwar, Tabinda
The Travelling Salesman Problem (TSP) is a classical combinatorial optimisation problem. Deep learning has been successfully extended to meta-learning, where previous solving efforts assist in learning how to optimise future optimisation instances. In recent years, learning to optimise approaches have shown success in solving TSP problems. However, they focus on one type of TSP problem, namely ones where the points are uniformly distributed in Euclidean spaces, and have issues in generalising to other embedding spaces, e.g., spherical distance spaces, and to TSP instances where the points are distributed in a non-uniform manner. An aim of learning to optimise is to train once and solve across a broad spectrum of (TSP) problems. Although supervised learning approaches have been shown to achieve better solutions than unsupervised approaches, they require generating training data and running a solver to obtain solutions to learn from, which can be time-consuming, and reasonable solutions can be difficult to find for harder TSP instances. Hence this paper introduces a new learning-based approach that solves a variety of common TSP problems and is trained on easier instances, which are faster to train on and for which good solutions are easier to obtain. We name this approach the non-Euclidean TSP network (NETSP-Net). The approach is evaluated on various TSP instances using the benchmark TSPLIB dataset and a popular instance generator used in the literature. We performed extensive experiments that indicate our approach generalises across many types of instances and scales to instances that are larger than those used during training.
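As a small illustration of what changes between Euclidean and spherical TSP instances, the sketch below evaluates the same tour under two interchangeable metrics. The haversine formula is used as one possible spherical distance; the function names and the (latitude, longitude) convention in degrees are assumptions and are not taken from NETSP-Net.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def great_circle(p, q, radius=1.0):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return 2 * radius * math.asin(min(1.0, math.sqrt(h)))

def tour_length(tour, points, metric):
    """Length of the closed tour visiting `points` in the order given by `tour`."""
    n = len(tour)
    return sum(metric(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n))
```

Because the metric is passed in as a function, the same tour can score very differently under the two geometries, which is one reason a model trained only on uniform Euclidean instances may not transfer.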
MurTree: Optimal Classification Trees via Dynamic Programming and Search
Demirović, Emir, Lukina, Anna, Hebrard, Emmanuel, Chan, Jeffrey, Bailey, James, Leckie, Christopher, Ramamohanarao, Kotagiri, Stuckey, Peter J.
Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy, size, and other considerations such as fairness. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes and we argue it can be extended with other requirements. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical realisation of optimal decision trees.
Alternative Blockmodelling
Correa, Oscar, Chan, Jeffrey, Nguyen, Vinh
Many approaches have been proposed to discover clusters within networks. The community-finding field encompasses approaches that try to discover clusters whose nodes are tightly related to each other but loosely related to nodes of other clusters. However, a community network configuration is not the only possible latent structure in a graph. Core-periphery and hierarchical network configurations are also valid structures to discover in a relational dataset. Moreover, a network is not completely explained by only knowing the membership of each node. A high-level view of the inter-cluster relationships is needed. Blockmodelling techniques deal with these two issues. Firstly, blockmodelling allows finding any network configuration besides the well-known community structure. Secondly, a blockmodel is a summary representation of a network which regards not only membership of nodes but also relations between clusters. Finally, a unique summary representation of a network is unlikely. Networks might hide more than one blockmodel. Therefore, our proposed problem aims to discover a secondary blockmodel representation of a network that is of good quality and dissimilar with respect to a given blockmodel. Our methodology is presented through two approaches: (a) inclusion of cannot-link constraints and (b) dissimilarity between image matrices. Both approaches are based on non-negative matrix factorisation (NMF), which fits the blockmodelling representation. The evaluation of these two approaches considers the quality and dissimilarity of the discovered alternative blockmodel, as these are the requirements of the problem.
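The notion of an image matrix can be sketched as follows: given an adjacency matrix and a cluster membership for each node, compute the edge density between every pair of clusters. This is a simple density-based stand-in rather than the NMF formulation used in the paper; the function name and the toy core-periphery graph are assumptions.

```python
def image_matrix(adj, membership, k):
    """Edge density between each ordered pair of the k clusters."""
    n = len(adj)
    edges = [[0] * k for _ in range(k)]
    pairs = [[0] * k for _ in range(k)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # ignore self-loops
            a, b = membership[i], membership[j]
            edges[a][b] += adj[i][j]
            pairs[a][b] += 1
    return [[edges[a][b] / pairs[a][b] if pairs[a][b] else 0.0
             for b in range(k)] for a in range(k)]

# Toy core-periphery graph: nodes 0 and 1 form the core and connect to
# every node; nodes 2, 3 and 4 are peripheral and connect only to the core.
n = 5
adj = [[0] * n for _ in range(n)]
for i, j in [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)]:
    adj[i][j] = adj[j][i] = 1
membership = [0, 0, 1, 1, 1]
image = image_matrix(adj, membership, 2)
```

For this graph the core row is dense everywhere while the periphery-to-periphery entry is zero, a pattern that a pure community structure (dense diagonal, sparse off-diagonal) cannot express.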
Approximating Optimisation Solutions for Travelling Officer Problem with Customised Deep Learning Network
Shao, Wei, Salim, Flora D., Chan, Jeffrey, Morrison, Sean, Zambetta, Fabio
Deep learning has been extended to a number of new domains with critical success, though some traditional orienteering problems such as the Travelling Salesman Problem (TSP) and its variants are not commonly solved using such techniques. Deep neural networks (DNNs) are a potentially promising and under-explored solution to solve these problems due to their powerful function approximation abilities, and their fast feed-forward computation. In this paper, we outline a method for converting an orienteering problem into a classification problem, and design a customised multi-layer deep learning network to approximate traditional optimisation solutions to this problem. We test the performance of the network on a real-world parking violation dataset, and conduct a generic study that empirically shows the critical architectural components that affect network performance for this problem.
A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks
Chan, Jeffrey, Perrone, Valerio, Spence, Jeffrey, Jenkins, Paul, Mathieson, Sara, Song, Yun
An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential challenges need to be addressed: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art.
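The exchangeability requirement can be illustrated with a minimal sketch: apply the same feature map to every individual (row) and pool with a symmetric function, so that reordering the individuals cannot change the result. The single tanh layer, the mean pooling, and all names and numbers below are illustrative assumptions, not the architecture evaluated in the paper.

```python
import math

def phi(row, weights):
    """Shared per-individual feature map: one tanh layer applied to a row."""
    return [math.tanh(sum(w * x for w, x in zip(ws, row))) for ws in weights]

def exchangeable_features(data, weights):
    """Mean-pool the per-row features; the result ignores row order."""
    feats = [phi(row, weights) for row in data]
    n = len(feats)
    # math.fsum returns the correctly rounded sum, independent of order.
    return [math.fsum(col) / n for col in zip(*feats)]

# Hypothetical toy data: four individuals, three binary sites each.
weights = [[0.5, -1.0, 0.25], [1.0, 0.0, -0.5]]
data = [[0, 1, 1], [1, 0, 1], [1, 1, 0], [0, 0, 1]]
pooled = exchangeable_features(data, weights)
pooled_reversed = exchangeable_features(data[::-1], weights)
```

In a full model the pooled vector would typically be passed to a further network that outputs the quantity of interest; the pooling step is what removes the dependence on how individuals are ordered in the input.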