Search
A Guide to Genetic 'Learning' Algorithms for Optimization
Genetic algorithms are randomized, adaptive heuristic search algorithms that act on a population of candidate solutions. They are based on the ideas of natural selection and genetics. New solutions are typically made by 'mutating' members of this population and by 'mating' two solutions to create a new one. The better solutions are selected to breed and mutate, and the worse ones are discarded. They are probabilistic search methods; that is, the states they explore are not determined entirely by the properties of the problem.
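The selection, mating, and mutation loop described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the bit-string encoding, the `one_max` fitness function, one-point crossover, and all parameter values are assumptions chosen for the example.

```python
import random

def one_max(bits):
    """Toy fitness (assumed for illustration): the number of 1s in the bit string."""
    return sum(bits)

def evolve(fitness, n_bits=20, pop_size=30, generations=50,
           mutation_rate=0.02, seed=0):
    """Run a simple genetic algorithm and return the fittest individual found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents and discard the rest.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Mating: one-point crossover of two distinct parents.
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit independently with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve(one_max)
print(one_max(best), best)
```

Because the parents survive unchanged into the next generation, the best fitness in the population never decreases in this sketch; other selection schemes (tournament, roulette wheel) trade that guarantee for more diversity.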
Importance measures derived from random forests: characterisation and extension
Nowadays new technologies, and especially artificial intelligence, are more and more established in our society. Big data analysis and machine learning, two sub-fields of artificial intelligence, are at the core of many recent breakthroughs in many application fields (e.g., medicine, communication, finance, ...), including some that are strongly related to our day-to-day life (e.g., social networks, computers, smartphones, ...). In machine learning, significant improvements are usually achieved at the price of increasing computational complexity and thanks to bigger datasets. Currently, cutting-edge models built by the most advanced machine learning algorithms have typically become very efficient and profitable but also extremely complex. Their complexity is such that these models are commonly seen as black boxes providing a prediction or a decision which cannot be interpreted or justified. Nevertheless, whether these models are used autonomously or as a simple decision-making support tool, they are already being used in machine learning applications where health and human life are at stake. Therefore, it appears to be an obvious necessity not to blindly believe everything coming out of those models without a detailed understanding of their predictions or decisions. Accordingly, this thesis aims at improving the interpretability of models built by a specific family of machine learning algorithms, the so-called tree-based methods. Several mechanisms have been proposed to interpret these models, and throughout this thesis we aim to improve their understanding, study their properties, and define their limitations.
Maxmin-Fair Ranking: Individual Fairness under Group-Fairness Constraints
Garcia-Soriano, David, Bonchi, Francesco
We study a novel problem of fairness in ranking aimed at minimizing the amount of individual unfairness introduced when enforcing group-fairness constraints. Our proposal is rooted in the distributional maxmin fairness theory, which uses randomization to maximize the expected satisfaction of the worst-off individuals. The bulk of the algorithmic fairness literature deals with group fairness along the lines of demographic parity [9] or equal opportunity [16]: this is typically expressed by means of some fairness constraint requiring that the top-$k$ positions (for any $k$) in the ranking contain enough elements from some groups that are protected from discrimination based on sex, race, age, etc. In fact, [6] shows ...
Minimax Estimation of Partially-Observed Vector AutoRegressions
Dalle, Guillaume, de Castro, Yohann
To understand the behavior of large dynamical systems like transportation networks, one must often rely on measurements transmitted by a set of sensors, for instance individual vehicles. Such measurements are likely to be incomplete and imprecise, which makes it hard to recover the underlying signal of interest. Hoping to quantify this phenomenon, we study the properties of a partially-observed state-space model. In our setting, the latent state $X$ follows a high-dimensional Vector AutoRegressive process $X_t = \theta X_{t-1} + \varepsilon_t$. Meanwhile, the observations $Y$ are given by a noise-corrupted random sample from the state $Y_t = \Pi_t X_t + \eta_t$. Several random sampling mechanisms are studied, allowing us to investigate the effect of spatial and temporal correlations in the distribution of the sampling matrices $\Pi_t$. We first prove a lower bound on the minimax estimation error for the transition matrix $\theta$. We then describe a sparse estimator based on the Dantzig selector and upper bound its non-asymptotic error, showing that it achieves the optimal convergence rate for most of our sampling mechanisms. Numerical experiments on simulated time series validate our theoretical findings, while an application to open railway data highlights the relevance of this model for public transport traffic analysis.
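To make the two model equations concrete, here is a small simulation sketch of the latent process and its partial observations. Several choices are assumptions made only for illustration: Gaussian noise for both $\varepsilon_t$ and $\eta_t$, a sampling matrix $\Pi_t$ modeled as a diagonal 0/1 mask with independent Bernoulli(`p`) entries (only one of many possible sampling mechanisms), and the names `simulate`, `sigma`, and `omega`. It does not implement the estimator from the abstract.

```python
import random

def simulate(theta, T, p=0.5, sigma=1.0, omega=0.1, seed=0):
    """Simulate X_t = theta X_{t-1} + eps_t and Y_t = Pi_t X_t + eta_t.

    Unobserved coordinates of Y_t are reported as None.
    """
    rng = random.Random(seed)
    d = len(theta)
    x = [0.0] * d
    states, observations = [], []
    for _ in range(T):
        # Latent VAR step with Gaussian innovation of standard deviation sigma.
        x = [sum(theta[i][j] * x[j] for j in range(d)) + rng.gauss(0.0, sigma)
             for i in range(d)]
        # Sampling step: each coordinate is observed with probability p
        # (assumed independent Bernoulli mask standing in for Pi_t).
        mask = [rng.random() < p for _ in range(d)]
        # Observed coordinates are corrupted by Gaussian noise of std omega.
        y = [x[i] + rng.gauss(0.0, omega) if mask[i] else None
             for i in range(d)]
        states.append(x)
        observations.append(y)
    return states, observations

theta = [[0.5, 0.1], [0.0, 0.5]]
states, observations = simulate(theta, T=5)
for x, y in zip(states, observations):
    print(x, y)
```

Recovering `theta` from `observations` alone is the estimation problem the abstract studies; the mask probability `p` and the noise level `omega` control how much information about the latent state is lost.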
Contrastive Reinforcement Learning of Symbolic Reasoning Domains
Poesia, Gabriel, Dong, WenXin, Goodman, Noah
Abstract symbolic reasoning, as required in domains such as mathematics and logic, is a key component of human intelligence. Solvers for these domains have important applications, especially to computer-assisted education. But learning to solve symbolic problems is challenging for machine learning algorithms. Existing models either learn from human solutions or use hand-engineered features, making them expensive to apply in new domains. In this paper, we instead consider symbolic domains as simple environments where states and actions are given as unstructured text, and binary rewards indicate whether a problem is solved. This flexible setup makes it easy to specify new domains, but search and planning become challenging. We introduce four environments inspired by the Mathematics Common Core Curriculum, and observe that existing Reinforcement Learning baselines perform poorly. We then present a novel learning algorithm, Contrastive Policy Learning (ConPoLe) that explicitly optimizes the InfoNCE loss, which lower bounds the mutual information between the current state and next states that continue on a path to the solution. ConPoLe successfully solves all four domains. Moreover, problem representations learned by ConPoLe enable accurate prediction of the categories of problems in a real mathematics curriculum. Our results suggest new directions for reinforcement learning in symbolic domains, as well as applications to mathematics education.
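For readers unfamiliar with the InfoNCE loss mentioned above, the following sketch computes it for a single positive example among a set of candidates. The inputs are hypothetical: `scores` stands for compatibility values between the current state and each candidate next state, and how ConPoLe actually produces such scores is not described in the abstract.

```python
import math

def info_nce(scores, positive_index):
    """InfoNCE loss: -log( exp(s_pos) / sum_j exp(s_j) ).

    Computed with the log-sum-exp trick for numerical stability.
    """
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[positive_index]

# With uninformative (equal) scores over N candidates the loss is log N.
print(info_nce([0.0, 0.0, 0.0, 0.0], positive_index=0))
# Raising the positive's score lowers the loss.
print(info_nce([3.0, 0.0, 0.0, 0.0], positive_index=0))
```

Minimizing this quantity pushes the score of the next state that stays on a solution path above the scores of the alternatives.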
Probabilistic DAG Search
Grosse, Julia, Zhang, Cheng, Hennig, Philipp
Exciting contemporary machine learning problems have recently been phrased in the classic formalism of tree search -- most famously, the game of Go. Interestingly, the state-space underlying these sequential decision-making problems often possesses a more general latent structure than can be captured by a tree. In this work, we develop a probabilistic framework to exploit a search space's latent structure and thereby share information across the search tree. The method is based on a combination of approximate inference in jointly Gaussian models for the explored part of the problem, and an abstraction for the unexplored part that imposes a reduction of complexity ad hoc. We empirically find our algorithm to compare favorably to existing non-probabilistic alternatives in Tic-Tac-Toe and a feature selection application.
Efficient Data-specific Model Search for Collaborative Filtering
Gao, Chen, Yao, Quanming, Jin, Depeng, Li, Yong
Collaborative filtering (CF), as a fundamental approach for recommender systems, is usually built on the latent factor model with learnable parameters to predict users' preferences towards items. However, designing a proper CF model for a given dataset is not easy, since the properties of datasets are highly diverse. In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model by AutoML techniques. The key here is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages of input encoding, embedding function, interaction function, and prediction function. We further develop an easy-to-use, robust, and efficient search strategy, which utilizes random search and a performance predictor for efficient searching within the above framework. In this way, we can combinatorially generalize data-specific CF models, which have not been visited in the literature, from SOTA ones. Extensive experiments on five real-world datasets demonstrate that our method can consistently outperform SOTA ones for various CF tasks. Further experiments verify the rationality of the proposed framework and the efficiency of the search strategy. The searched CF models can also provide insights for exploring more effective methods in the future.
Planning Spatial Networks
Darvariu, Victor-Alexandru, Hailes, Stephen, Musolesi, Mirco
We tackle the problem of goal-directed graph construction: given a starting graph, a global objective function (e.g., communication efficiency), and a budget of modifications, the aim is to find a set of edges whose addition to the graph maximally improves the objective. This problem emerges in many networks of great importance for society such as transportation and critical infrastructure networks. We identify two significant shortcomings with present methods. Firstly, they focus exclusively on network topology while ignoring spatial information; however, in many real-world networks, nodes are embedded in space, which yields different global objectives and governs the range and density of realizable connections. Secondly, existing RL methods scale poorly to large networks due to the high cost of training a model and the scaling factors of the action space and global objectives. In this work, we formulate the problem of goal-directed construction of spatial networks as a deterministic MDP. We adopt the Monte Carlo Tree Search framework for planning in this domain, prioritizing the optimality of final solutions over the speed of policy evaluation. We propose several improvements over the standard UCT algorithm for this family of problems, addressing their single-agent nature, the trade-off between the costs of edges and their contribution to the objective, and an action space linear in the number of nodes. We demonstrate the suitability of this approach for improving the global efficiency and attack resilience of a variety of synthetic and real-world networks, including Internet backbone networks and metro systems. We obtain 24% better solutions on average compared to UCT on the largest networks tested, and scalability superior to previous methods.
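As background for the UCT baseline referred to above, this sketch shows the standard UCB1-style selection rule used in UCT: the mean reward of a child plus an exploration bonus that shrinks as the child is visited more often. It illustrates only the textbook rule, not the improvements proposed in the abstract; the function names, the `(action, total_reward, visits)` tuple layout, and the default exploration constant are assumptions.

```python
import math

def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Mean reward plus exploration bonus; unvisited children are tried first."""
    if visits == 0:
        return math.inf
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """Pick the action whose child node has the highest UCT score.

    children: list of (action, total_reward, visits) tuples.
    """
    return max(children,
               key=lambda ch: uct_score(ch[1], ch[2], parent_visits))[0]

# Two children with the same mean reward: the less-visited one is preferred.
children = [("add_edge_a", 5.0, 10), ("add_edge_b", 0.5, 1)]
print(select_child(children, parent_visits=11))
```

The constant `c` controls the balance between exploiting edges that already look good and exploring rarely tried ones.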
Zero-Cost Proxies Meet Differentiable Architecture Search
Xiang, Lichuan, Dudziak, Łukasz, Abdelfattah, Mohamed S., Chau, Thomas, Lane, Nicholas D., Wen, Hongkai
Differentiable neural architecture search (NAS) has attracted significant attention in recent years due to its ability to quickly discover promising architectures of deep neural networks even in very large search spaces. Despite its success, DARTS lacks robustness in certain cases, e.g. it may degenerate to trivial architectures with excessive parameter-free operations such as skip connection or random noise, leading to inferior performance. In particular, operation selection based on the magnitude of architectural parameters was recently proven to be fundamentally wrong, showcasing the need to rethink this aspect. On the other hand, zero-cost proxies have been recently studied in the context of sample-based NAS, showing promising results -- speeding up the search process drastically in some cases but also failing on some of the large search spaces typical for differentiable NAS. In this work we propose a novel operation selection paradigm in the context of differentiable NAS which utilises zero-cost proxies. Our perturbation-based zero-cost operation selection (Zero-Cost-PT) improves searching time and, in many cases, accuracy compared to the best available differentiable architecture search, regardless of the search space size. Specifically, we are able to find comparable architectures to DARTS-PT on the DARTS CNN search space while being over 40x faster (total searching time 25 minutes on a single GPU).
Solving Graph-based Public Good Games with Tree Search and Imitation Learning
Darvariu, Victor-Alexandru, Hailes, Stephen, Musolesi, Mirco
Public goods games represent insightful settings for studying incentives for individual agents to make contributions that, while costly for each of them, benefit the wider society. In this work, we adopt the perspective of a central planner with a global view of a network of self-interested agents and the goal of maximizing some desired property in the context of a best-shot public goods game. Existing algorithms for this known NP-complete problem find solutions that are sub-optimal and cannot optimize for criteria other than social welfare. In order to efficiently solve public goods games, our proposed method directly exploits the correspondence between equilibria and the Maximal Independent Set (mIS) structural property of graphs. In particular, we define a Markov Decision Process, which incrementally generates an mIS, and adopt a planning method to search for equilibria, outperforming existing methods. Furthermore, we devise an imitation learning technique that uses demonstrations of the search to obtain a graph neural network parametrized policy which quickly generalizes to unseen game instances. Our evaluation results show that this policy is able to reach 99.5% of the performance of the planning method while being approximately three orders of magnitude faster to evaluate on the largest graphs tested. The methods presented in this work can be applied to a large class of public goods games of potentially high societal impact.
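The idea of incrementally generating a maximal independent set can be illustrated with a simple sequential construction, in which each step adds one node and rules out its neighbours. This is a sketch under stated assumptions rather than the method from the abstract: the node order is supplied by the caller (in a planning setting it would be chosen by the search), the graph is an undirected adjacency dictionary without self-loops, and the function names are hypothetical.

```python
def build_mis(adj, order):
    """Build a maximal independent set by visiting every node in `order`.

    adj: dict mapping each node to the set of its neighbours.
    order: a sequence containing every node of the graph exactly once.
    """
    selected, blocked = set(), set()
    for v in order:
        if v not in blocked:
            # v has no selected neighbour, so it can join the set ...
            selected.add(v)
            # ... and its neighbours can no longer be selected.
            blocked.update(adj[v])
    return selected

def is_maximal_independent(adj, s):
    """Check that no two nodes in s are adjacent and no further node can be added."""
    independent = all(u not in adj[v] for v in s for u in s)
    maximal = all(v in s or any(u in s for u in adj[v]) for v in adj)
    return independent and maximal

# Path graph 0 - 1 - 2 - 3: different orders yield different sets.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(build_mis(adj, [0, 1, 2, 3]))
print(build_mis(adj, [1, 0, 2, 3]))
```

Since different orders lead to different maximal independent sets, choosing the order is where an objective such as social welfare can be optimized.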