Domain-independent Dominance of Adaptive Methods
Savarese, Pedro, McAllester, David, Babu, Sudarshan, Maire, Michael
From a simplified analysis of adaptive methods, we derive AvaGrad, a new optimizer which outperforms SGD on vision tasks when its adaptability is properly tuned. We observe that the power of our method is partially explained by a decoupling of learning rate and adaptability, greatly simplifying hyperparameter search. In light of this observation, we demonstrate that, against conventional wisdom, Adam can also outperform SGD on vision tasks, as long as the coupling between its learning rate and adaptability is taken into account. In practice, AvaGrad matches the best results, as measured by generalization accuracy, delivered by any existing optimizer (SGD or adaptive) across image classification (CIFAR, ImageNet) and character-level language modelling (Penn Treebank) tasks. This latter observation, alongside AvaGrad's decoupling of hyperparameters, could make it the preferred optimizer for deep learning, replacing both SGD and Adam.
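The coupling mentioned for Adam can be illustrated with a small sketch. It assumes that "adaptability" refers to Adam's epsilon term (the abstract does not say so explicitly) and uses the standard Adam update, step = lr * m_hat / (sqrt(v_hat) + eps). The function name and the numeric values are hypothetical, chosen only for illustration.

```python
import math

def adam_step(m_hat, v_hat, lr, eps):
    """Magnitude of one standard Adam update for a single parameter.

    m_hat and v_hat are the bias-corrected first and second moment
    estimates. Treating eps as the 'adaptability' knob is an assumption.
    """
    return lr * m_hat / (math.sqrt(v_hat) + eps)

# Hypothetical moment estimates with a small second moment.
m_hat, v_hat = 0.1, 1e-8

# When eps dominates sqrt(v_hat), the step is roughly (lr / eps) * m_hat,
# so raising eps at a fixed lr shrinks the step ...
small_eps_step = adam_step(m_hat, v_hat, lr=1e-3, eps=1.0)
large_eps_step = adam_step(m_hat, v_hat, lr=1e-3, eps=10.0)

# ... while scaling lr together with eps keeps it nearly unchanged.
rescaled_step = adam_step(m_hat, v_hat, lr=1e-2, eps=10.0)
```

In this regime the learning rate and epsilon cannot be tuned independently, which is one way to read the coupling the abstract describes.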
Blockchain Intelligence: When Blockchain Meets Artificial Intelligence
Blockchain is gaining extensive attention because it enables secure and decentralized resource sharing. However, incumbent blockchain systems also face a number of challenges in operational maintenance, quality assurance of smart contracts, and detection of malicious behaviour in blockchain data. Recent advances in artificial intelligence bring opportunities to overcome these challenges. Integrating blockchain with artificial intelligence can enhance current blockchain systems. This article introduces the convergence of blockchain and artificial intelligence (namely, blockchain intelligence). It also presents a case study to further demonstrate the feasibility of blockchain intelligence and points out future directions.
Measuring the Reliability of Reinforcement Learning Algorithms
Chan, Stephanie C. Y., Fishman, Sam, Canny, John, Korattikara, Anoop, Guadarrama, Sergio
Lack of reliability is a well-known issue for reinforcement learning (RL) algorithms. This problem has gained increasing attention in recent years, and efforts to improve it have grown substantially. To aid RL researchers and production users with the evaluation and improvement of reliability, we propose a set of metrics that quantitatively measure different aspects of reliability. In this work, we focus on variability and risk, both during training and after learning (on a fixed policy). We designed these metrics to be general-purpose, and we also designed complementary statistical tests to enable rigorous comparisons on these metrics. In this paper, we first describe the desired properties of the metrics and their design, the aspects of reliability that they measure, and their applicability to different scenarios. We then describe the statistical tests and make additional practical recommendations for reporting results. The metrics and accompanying statistical tools have been made available as an open-source library, here: https://github.com/google-research/rl-reliability-metrics . We apply our metrics to a set of common RL algorithms and environments, compare them, and analyze the results.
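As a rough illustration of what variability and risk measurements can look like, the sketch below computes the inter-quartile range across training runs and the mean of the worst-performing fraction of returns. These particular statistics, the function names, and the default `alpha` are assumptions made for this example; they are not taken from the abstract and do not reproduce the API of the linked library.

```python
import statistics

def iqr(values):
    """Inter-quartile range of a collection of numbers (a robust spread measure)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def dispersion_across_runs(curves):
    """Average, over evaluation points, of the IQR across training runs.

    curves: list of per-run performance curves, all of the same length.
    """
    per_step = [iqr(step_values) for step_values in zip(*curves)]
    return sum(per_step) / len(per_step)

def lower_tail_mean(values, alpha=0.25):
    """Mean of the worst alpha-fraction of values (a simple risk measure)."""
    ordered = sorted(values)
    k = max(1, int(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / len(tail)

# Hypothetical data: five training runs, two evaluation points each.
curves = [[1, 2], [2, 4], [3, 6], [4, 8], [5, 10]]
spread = dispersion_across_runs(curves)

# Hypothetical returns from rollouts of a fixed policy.
risk = lower_tail_mean([10, 1, 5, 3], alpha=0.5)
```

Lower spread indicates more reproducible training, and a higher lower-tail mean indicates a less severe worst case.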
Efficient and Robust Reinforcement Learning with Uncertainty-based Value Expansion
Zhou, Bo, Zeng, Hongsheng, Wang, Fan, Li, Yunxiang, Tian, Hao
By integrating dynamics models into model-free reinforcement learning (RL) methods, model-based value expansion (MVE) algorithms have shown a significant advantage in sample efficiency as well as value estimation. However, these methods suffer from higher function approximation errors than model-free methods in stochastic environments because they do not model environmental randomness. As a result, their performance lags behind the best model-free algorithms in some challenging scenarios. In this paper, we propose a novel Hybrid-RL method that builds on MVE, namely the Risk Averse Value Expansion (RAVE). With imaginative rollouts generated by an ensemble of probabilistic dynamics models, we further introduce risk aversion by seeking the lower confidence bound of the estimation. Experiments on a range of challenging environments show that by modeling the uncertainty completely, RAVE substantially enhances the robustness of previous model-based methods and yields state-of-the-art performance. With this technique, our solution took first place in NeurIPS 2019: Learn to Move.
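The idea of a lower confidence bound over an ensemble can be sketched as follows. The sketch assumes the bound takes the common form "mean minus a multiple of the standard deviation" across ensemble members; the exact form used by RAVE is not given in the abstract, and the `alpha` coefficient and function name are hypothetical.

```python
import statistics

def lower_confidence_bound(estimates, alpha=1.0):
    """Risk-averse summary of value estimates from an ensemble.

    estimates: one value-target estimate per ensemble member.
    alpha: hypothetical risk-aversion coefficient; 0 recovers the plain mean.
    """
    mean = statistics.fmean(estimates)
    std = statistics.pstdev(estimates)  # population std across the ensemble
    return mean - alpha * std

# Hypothetical value targets produced by rollouts under four dynamics models.
ensemble_targets = [9.0, 11.0, 10.0, 10.0]
pessimistic_target = lower_confidence_bound(ensemble_targets, alpha=1.0)
```

The more the ensemble members disagree, the further the target is pulled below the mean, which discourages the agent from trusting uncertain model rollouts.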
Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
Islam, Riashat, Seraj, Raihan, Bacon, Pierre-Luc, Precup, Doina
The policy gradient theorem is defined based on an objective with respect to the initial distribution over states. In the discounted case, this results in policies that are optimal for one distribution over initial states, but may not be uniformly optimal for others, no matter where the agent starts from. Furthermore, to obtain unbiased gradient estimates, the starting point of the policy gradient estimator requires sampling states from a normalized discounted weighting of states. However, the difficulty of estimating the normalized discounted weighting of states, or the stationary state distribution, is quite well-known. Additionally, the large sample complexity of policy gradient methods is often attributed to insufficient exploration, and to remedy this, it is often assumed that the restart distribution provides sufficient exploration in these algorithms. In this work, we propose exploration in policy gradient methods based on maximizing the entropy of the discounted future state distribution. A key contribution of our work is a practically feasible algorithm to estimate the normalized discounted weighting of states, i.e., the discounted future state distribution. We propose that exploration can be achieved by entropy regularization with the discounted state distribution in policy gradients, where a metric for maximal coverage of the state space can be based on the entropy of the induced state distribution. The proposed approach can be considered as a three time-scale algorithm and, under some mild technical conditions, we prove its convergence to a locally optimal policy. Experimentally, we demonstrate the usefulness of regularization with the discounted future state distribution in terms of increased state space coverage and faster learning on a range of complex tasks.
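To make the quantity concrete, the sketch below estimates a normalized discounted weighting of states from sampled trajectories by simple counting, and then computes its entropy. This tabular Monte Carlo estimate is a stand-in chosen for illustration; it is not the estimation algorithm proposed in the paper, and all names and values are hypothetical.

```python
import math
from collections import defaultdict

def discounted_state_distribution(trajectories, gamma=0.99):
    """Empirical normalized discounted weighting of discrete states.

    Each visit to a state at time step t contributes weight gamma**t;
    the weights are then normalized to sum to one.
    """
    weights = defaultdict(float)
    for states in trajectories:
        for t, state in enumerate(states):
            weights[state] += gamma ** t
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution given as a dict."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

# Hypothetical trajectory over two discrete states.
dist = discounted_state_distribution([["a", "b"]], gamma=0.5)
coverage = entropy(dist)
```

A policy that spreads its discounted visitation over more states yields a higher entropy, which is the sense in which the entropy can serve as a coverage measure.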
Explainability Fact Sheets: A Framework for Systematic Assessment of Explainable Approaches
Explanations in Machine Learning come in many forms, but a consensus regarding their desired properties is yet to emerge. In this paper we introduce a taxonomy and a set of descriptors that can be used to characterise and systematically assess explainable systems along five key dimensions: functional, operational, usability, safety and validation. In order to design a comprehensive and representative taxonomy and associated descriptors we surveyed the eXplainable Artificial Intelligence literature, extracting the criteria and desiderata that other authors have proposed or implicitly used in their research. The survey includes papers introducing new explainability algorithms to see what criteria are used to guide their development and how these algorithms are evaluated, as well as papers proposing such criteria from both computer science and social science perspectives. This novel framework enables systematic comparison and contrast of explainability approaches, not just to better understand their capabilities but also to identify discrepancies between their theoretical qualities and properties of their implementations. We developed an operationalisation of the framework in the form of Explainability Fact Sheets, which enable researchers and practitioners alike to quickly grasp the capabilities and limitations of a particular explainable method. When used as a Work Sheet, our taxonomy can guide the development of new explainability approaches by aiding in their critical evaluation along the five proposed dimensions.
Completion Reasoning Emulation for the Description Logic EL+
Eberhart, Aaron, Ebrahimi, Monireh, Zhou, Lu, Shimizu, Cogan, Hitzler, Pascal
We present a new approach to integrating deep learning with knowledge-based systems that we believe shows promise. Our approach seeks to emulate reasoning structure, which can be inspected part-way through, rather than simply learning reasoner answers, which is typical in many of the black-box systems currently in use. We demonstrate that this idea is feasible by training a long short-term memory (LSTM) artificial neural network to learn EL+ reasoning patterns with two different data sets. We also show that this trained system is resistant to noise by corrupting a percentage of the test data and comparing the reasoner's and LSTM's predictions on corrupt data with correct answers.
Fuzzy Rule Interpolation Toolbox for the GNU Open-Source OCTAVE
Alzubi, Maen, Almseidin, Mohammad, Lone, Mohd Aaqib, Kovacs, Szilveszter
In most fuzzy control applications (applying classical fuzzy reasoning), the reasoning method requires a complete fuzzy rule-base, i.e., all possible observations must be covered by the antecedents of the fuzzy rules, and such a rule-base is not always available. Fuzzy control systems based on the Fuzzy Rule Interpolation (FRI) concept play a major role in different platforms when only a sparse fuzzy rule-base is available. In such cases, the fuzzy model contains only the most relevant rules, without covering all the antecedent universes. The first FRI toolbox able to handle different FRI methods was developed by Johanyak et al. in 2006 for the MATLAB environment. The goal of this paper is to introduce some details of the adaptation of the FRI toolbox to support the GNU/OCTAVE programming language. The OCTAVE Fuzzy Rule Interpolation (OCTFRI) Toolbox is an open-source toolbox for the OCTAVE programming language, providing a large functionally compatible subset of the MATLAB FRI toolbox as well as many extensions. The OCTFRI Toolbox includes functions that enable the user to evaluate Fuzzy Inference Systems (FISs) from the command line and from OCTAVE scripts, read/write FISs and OBS to/from files, and produce a graphical visualisation of both the membership functions and the FIS outputs. Future work will focus on implementing advanced fuzzy inference techniques and GUI tools.
Datamorphic Testing: A Methodology for Testing AI Applications
Zhu, Hong, Liu, Dongmei, Bayley, Ian, Harrison, Rachel, Cuzzolin, Fabio
With the rapid growth of the applications of machine learning (ML) and other artificial intelligence (AI) techniques, adequate testing has become a necessity to ensure their quality. This paper identifies the characteristics of AI applications that distinguish them from traditional software, and analyses the main difficulties in applying existing testing methods. Based on this analysis, we propose a new method called datamorphic testing and illustrate the method with an example of testing face recognition applications. We also report an experiment with four real industrial application systems of face recognition to validate the proposed approach.
Qualitative Numeric Planning: Reductions and Complexity
Qualitative numerical planning is classical planning extended with nonnegative real variables that can be increased or decreased "qualitatively", i.e., by positive random amounts. While deterministic planning with numerical variables is undecidable in general, qualitative numerical planning is decidable and provides a convenient abstract model for generalized planning. Qualitative numerical planning, introduced by Srivastava, Zilberstein, Immerman, and Geffner (2011), showed that solutions to qualitative numerical problems (QNPs) correspond to the strong cyclic solutions of an associated fully observable non-deterministic (FOND) problem that terminate. The approach leads to a generate-and-test algorithm for solving QNPs where solutions to a FOND problem are generated one by one and tested for termination. The computational shortcomings of this approach, however, are that it is not simple to amend FOND planners to generate all solutions, and that the number of solutions to check can be doubly exponential in the number of variables. In this work we address these limitations, while providing additional insights on QNPs. More precisely, we introduce two reductions, one from QNPs to FOND problems and the other from FOND problems to QNPs both of which do not involve termination tests. A result of these reductions is that QNPs are shown to have the same expressive power and the same complexity as FOND problems.