This AI can explain how it solves Rubik's Cube--and that's a big deal

#artificialintelligence

However, these AI algorithms cannot explain the thought processes behind their decisions. A computer that masters protein folding and also tells researchers more about the rules of biology is much more useful than a computer that folds proteins without explanation. Therefore, AI researchers like me are now turning our efforts toward developing AI algorithms that can explain themselves in a manner that humans can understand. If we can do this, I believe that AI will be able to uncover and teach people new facts about the world that have not yet been discovered, leading to new innovations. One field of AI, called reinforcement learning, studies how computers can learn from their own experiences.


Feature Selection Using Reinforcement Learning

arXiv.org Machine Learning

With the decreasing cost of data collection, the space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially. Therefore, identifying the most characterizing features, those that minimize the variance without jeopardizing the bias of our models, is critical to successfully training a machine learning model. In addition, identifying such features is critical for interpretability, prediction accuracy and optimal computation cost. While statistical methods such as subset selection, shrinkage and dimensionality reduction have been applied to select the best set of features, other approaches in the literature have treated feature selection as a search problem in which each state in the search space is a possible feature subset. In this paper, we solve the feature selection problem using Reinforcement Learning. Formulating the state space as a Markov Decision Process (MDP), we use the Temporal Difference (TD) algorithm to select the best subset of features. Each state is evaluated using a robust and low-cost classifier algorithm that can handle any non-linearities in the dataset.


Noisy intermediate-scale quantum (NISQ) algorithms

arXiv.org Artificial Intelligence

A universal fault-tolerant quantum computer that can efficiently solve problems such as integer factorization and unstructured database search requires millions of qubits with low error rates and long coherence times. While the experimental advancement towards realizing such devices will potentially take decades of research, noisy intermediate-scale quantum (NISQ) computers already exist. These computers are composed of hundreds of noisy qubits, i.e. qubits that are not error-corrected, and therefore perform imperfect operations in a limited coherence time. In the search for quantum advantage with these devices, algorithms have been proposed for applications in various disciplines spanning physics, machine learning, quantum chemistry and combinatorial optimization. The goal of such algorithms is to leverage the limited available resources to perform classically challenging tasks. In this review, we provide a thorough summary of NISQ computational paradigms and algorithms. We discuss the key structure of these algorithms, their limitations, and advantages. We additionally provide a comprehensive overview of various benchmarking and software tools useful for programming and testing NISQ devices.


Zero-Cost Proxies for Lightweight NAS

arXiv.org Artificial Intelligence

Neural Architecture Search (NAS) is quickly becoming the standard methodology to design neural network models. However, NAS is typically compute-intensive because multiple models need to be evaluated before choosing the best one. To reduce the computational power and time needed, a proxy task is often used for evaluating each model instead of full training. In this paper, we evaluate conventional reduced-training proxies and quantify how well they preserve ranking between multiple models during search when compared with the rankings produced by final trained accuracy. We propose a series of zero-cost proxies, based on recent pruning literature, that use just a single minibatch of training data to compute a model's score. Our zero-cost proxies use 3 orders of magnitude less computation but can match and even outperform conventional proxies. For example, Spearman's rank correlation coefficient between final validation accuracy and our best zero-cost proxy on NAS-Bench-201 is 0.82, compared to 0.61 for EcoNAS (a recently proposed reduced-training proxy). Finally, we use these zero-cost proxies to enhance existing NAS search algorithms such as random search, reinforcement learning, evolutionary search and predictor-based search. For all search methodologies and across three different NAS datasets, we are able to significantly improve sample efficiency, and thereby decrease computation, by using our zero-cost proxies. For example on NAS-Bench-101, we achieved the same accuracy 4$\times$ quicker than the best previous result.
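The ranking comparison in the abstract above rests on Spearman's rank correlation between a proxy score and final accuracy. A minimal sketch in plain Python follows; the function names and the four (proxy score, accuracy) pairs are made up for illustration and are not taken from the paper or from NAS-Bench-201.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: one proxy score and one final accuracy per architecture.
proxy_scores = [0.2, 0.5, 0.1, 0.9]
final_accuracy = [72.0, 70.0, 65.0, 80.0]
rho = spearman(proxy_scores, final_accuracy)  # 0.8: one adjacent pair is swapped
```

A value near 1 means the proxy orders architectures almost exactly as full training would, which is the property a search algorithm needs from a cheap score.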


A Survey on the Explainability of Supervised Machine Learning

Journal of Artificial Intelligence Research

Predictions obtained by, e.g., artificial neural networks have a high accuracy, but humans often perceive the models as black boxes. Insights about the decision making are mostly opaque for humans. Understanding the decision making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance. The decision making behind the black boxes needs to be more transparent, accountable, and understandable for humans. This survey paper provides essential definitions and an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.


Minimax Off-Policy Evaluation for Multi-Armed Bandits

arXiv.org Machine Learning

We study the problem of off-policy evaluation in the multi-armed bandit model with bounded rewards, and develop minimax rate-optimal procedures under three settings. First, when the behavior policy is known, we show that the Switch estimator, a method that alternates between the plug-in and importance sampling estimators, is minimax rate-optimal for all sample sizes. Second, when the behavior policy is unknown, we analyze performance in terms of the competitive ratio, thereby revealing a fundamental gap between the settings of known and unknown behavior policies. When the behavior policy is unknown, any estimator must have mean-squared error larger -- relative to the oracle estimator equipped with the knowledge of the behavior policy -- by a multiplicative factor proportional to the support size of the target policy. Moreover, we demonstrate that the plug-in approach achieves this worst-case competitive ratio up to a logarithmic factor. Third, we initiate the study of the partial knowledge setting in which it is assumed that the minimum probability taken by the behavior policy is known. We show that the plug-in estimator is optimal for relatively large values of the minimum probability, but is sub-optimal when the minimum probability is low. In order to remedy this gap, we propose a new estimator based on approximation by Chebyshev polynomials that provably achieves the optimal estimation error. Numerical experiments on both simulated and real data corroborate our theoretical findings.
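The two base estimators named in the abstract above can be sketched in a few lines of plain Python. The `switch` function below is only one simple way to combine them: the likelihood-ratio threshold `tau` and the rule for which arms go to which estimator are assumptions made for illustration, not the paper's exact construction. Policies are dictionaries mapping arm to probability, and the logged data is a list of (arm, reward) pairs.

```python
from collections import defaultdict


def plug_in(data, target):
    """Weight each arm's empirical mean reward by the target policy."""
    sums, counts = defaultdict(float), defaultdict(int)
    for arm, reward in data:
        sums[arm] += reward
        counts[arm] += 1
    # Arms never observed in the log contribute nothing to the estimate.
    return sum(p * sums[a] / counts[a] for a, p in target.items() if counts[a] > 0)


def importance_sampling(data, target, behavior):
    """Reweight each logged reward by target(arm) / behavior(arm)."""
    return sum(target.get(a, 0.0) / behavior[a] * r for a, r in data) / len(data)


def switch(data, target, behavior, tau):
    """Assumed rule: plug-in on arms whose ratio exceeds tau, IS on the rest."""
    heavy = {a for a, p in target.items() if p / behavior[a] > tau}
    is_part = sum(
        target.get(a, 0.0) / behavior[a] * r for a, r in data if a not in heavy
    ) / len(data)
    plug_part = plug_in(data, {a: p for a, p in target.items() if a in heavy})
    return is_part + plug_part


# Hypothetical two-armed log collected under a uniform behavior policy.
behavior = {0: 0.5, 1: 0.5}
target = {0: 0.2, 1: 0.8}
log = [(0, 1.0), (0, 0.0), (1, 1.0), (1, 1.0)]

estimates = (
    plug_in(log, target),
    importance_sampling(log, target, behavior),
    switch(log, target, behavior, tau=1.0),
)
```

In this toy log the empirical arm frequencies match the behavior policy exactly, so all three estimates coincide; they differ once the observed frequencies drift from the behavior probabilities.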


Performance analysis of greedy algorithms for minimising a Maximum Mean Discrepancy

arXiv.org Machine Learning

We analyse the performance of several iterative algorithms for the quantisation of a probability measure $\mu$, based on the minimisation of a Maximum Mean Discrepancy (MMD). Our analysis includes kernel herding, greedy MMD minimisation and Sequential Bayesian Quadrature (SBQ). We show that the finite-sample-size approximation error, measured by the MMD, decreases as $1/n$ for SBQ and also for kernel herding and greedy MMD minimisation when using a suitable step-size sequence. The upper bound on the approximation error is slightly better for SBQ, but the other methods are significantly faster, with a computational cost that increases only linearly with the number of points selected. This is illustrated by two numerical examples, with the target measure $\mu$ being uniform (a space-filling design application) and with $\mu$ a Gaussian mixture.
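As a rough illustration of the kind of greedy procedure analysed above, here is a minimal kernel herding sketch in plain Python. It assumes a Gaussian kernel with an arbitrary bandwidth, a target measure approximated by a uniform grid of candidate points on [0, 1], and the common 1/(n+1) weighting; all of these choices and names are illustrative rather than taken from the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth=0.2):
    return math.exp(-((x - y) ** 2) / (2 * bandwidth ** 2))


def kernel_herding(candidates, n_points, kernel=gaussian_kernel):
    """Greedily pick points whose empirical measure tracks the candidate set."""
    m = len(candidates)
    # Potential of the target: average kernel value against all candidates.
    potential = [sum(kernel(x, y) for y in candidates) / m for x in candidates]
    running = [0.0] * m  # sum of kernel(candidate_j, x_i) over points chosen so far
    chosen = []
    for n in range(n_points):
        # Favour high target potential, penalise closeness to chosen points.
        scores = [potential[j] - running[j] / (n + 1) for j in range(m)]
        best = max(range(m), key=scores.__getitem__)
        chosen.append(candidates[best])
        for j in range(m):
            running[j] += kernel(candidates[j], candidates[best])
    return chosen


def mmd_squared(points, candidates, kernel=gaussian_kernel):
    """Squared MMD between the uniform measures on points and on candidates."""
    n, m = len(points), len(candidates)
    k_pp = sum(kernel(a, b) for a in points for b in points) / n ** 2
    k_pc = sum(kernel(a, b) for a in points for b in candidates) / (n * m)
    k_cc = sum(kernel(a, b) for a in candidates for b in candidates) / m ** 2
    return k_pp - 2 * k_pc + k_cc


grid = [i / 100 for i in range(101)]  # stand-in for a uniform target on [0, 1]
design = kernel_herding(grid, 10)
error = mmd_squared(design, grid)
```

Each iteration costs time linear in the number of candidates, which matches the abstract's remark that the cheaper greedy methods scale linearly with the number of points selected.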


Player-AI Interaction: What Neural Network Games Reveal About AI as Play

arXiv.org Artificial Intelligence

The advent of artificial intelligence (AI) and machine learning (ML) brings human-AI interaction to the forefront of HCI research. This paper argues that games are an ideal domain for studying and experimenting with how humans interact with AI. Through a systematic survey of neural network games (n = 38), we identified the dominant interaction metaphors and AI interaction patterns in these games. In addition, we applied existing human-AI interaction guidelines to further shed light on player-AI interaction in the context of AI-infused systems. Our core finding is that AI as play can expand current notions of human-AI interaction, which are predominantly productivity-based. In particular, our work suggests that game and UX designers should consider flow to structure the learning curve of human-AI interaction, incorporate discovery-based learning to play around with the AI and observe the consequences, and offer users an invitation to play to explore new forms of human-AI interaction.


Hierarchical Width-Based Planning and Learning

arXiv.org Artificial Intelligence

Width-based search methods have demonstrated state-of-the-art performance in a wide range of testbeds, from classical planning problems to image-based simulators such as Atari games. These methods scale independently of the size of the state-space, but exponentially in the problem width. In practice, running Iterated Width (IW) with a width larger than 1 is computationally intractable, which prevents it from solving higher-width problems. In this paper, we present a hierarchical algorithm that plans at two levels of abstraction. A high-level planner uses abstract features that are incrementally discovered from low-level pruning decisions. We illustrate this algorithm in classical planning PDDL domains as well as in pixel-based simulator domains. In classical planning, we show how IW(1) at two levels of abstraction can solve problems of width 2. For pixel-based domains, we show how, in combination with a learned policy and a learned value function, the proposed hierarchical IW can outperform current flat IW-based planners in Atari games with sparse rewards.
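To make the width limitation concrete, here is a minimal sketch of flat IW(1): a breadth-first search that prunes any state that does not make at least one atom true for the first time. The 3x3 grid domain, the atom encoding and all function names are hypothetical and chosen only for illustration; this is not the hierarchical algorithm proposed in the paper.

```python
from collections import deque


def iw1(initial, successors, atoms, is_goal):
    """Breadth-first search keeping only states that contribute a new atom."""
    seen_atoms = set(atoms(initial))
    queue = deque([(initial, [])])
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            new_atoms = set(atoms(nxt)) - seen_atoms
            if not new_atoms:
                continue  # no atom is new, so IW(1) prunes this state
            seen_atoms |= new_atoms
            queue.append((nxt, plan + [action]))
    return None  # search space exhausted under the novelty pruning


# Hypothetical toy domain: an agent on a 3x3 grid that can move right or up.
def grid_successors(state):
    x, y = state
    moves = []
    if x < 2:
        moves.append(("right", (x + 1, y)))
    if y < 2:
        moves.append(("up", (x, y + 1)))
    return moves


def grid_atoms(state):
    x, y = state
    return [("x", x), ("y", y)]  # one atom per coordinate value


# A single-atom goal is reached; the conjunctive corner goal is pruned away.
plan_to_column = iw1((0, 0), grid_successors, grid_atoms, lambda s: s[0] == 2)
plan_to_corner = iw1((0, 0), grid_successors, grid_atoms, lambda s: s == (2, 2))
```

In this encoding the goal of reaching column 2 is found with two moves, while the corner (2, 2) is never reached: every state on a path to it repeats atoms already seen, so the search returns `None`. Goals that need two atoms to hold jointly are the kind of case where a width above 1 becomes necessary.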


Cost-optimal Planning, Delete Relaxation, Approximability, and Heuristics

Journal of Artificial Intelligence Research

Cost-optimal planning is a very well-studied topic within planning, and it has proven to be computationally hard both in theory and in practice. Since cost-optimal planning is an optimisation problem, it is natural to analyse it through the lens of approximation. An important reason for studying cost-optimal planning is heuristic search; heuristic functions that guide the search in planning can often be viewed as algorithms solving or approximating certain optimisation problems. Many heuristic functions (such as the ubiquitous h+ heuristic) are based on delete relaxation, which ignores negative effects of actions. Planning for instances where the actions have no negative effects is often referred to as monotone planning. The aim of this article is to analyse the approximability of cost-optimal monotone planning, and thus the performance of relevant heuristic functions. Our findings imply that it may be beneficial to study these kinds of problems within the framework of parameterised complexity, and we initiate work in this direction.