Undirected Networks
Intelligence, physics and information -- the tradeoff between accuracy and simplicity in machine learning
How can we enable machines to make sense of the world, and become better at learning? To approach this goal, I believe viewing intelligence in terms of many integral aspects, and also a universal two-term tradeoff between task performance and complexity, provides two feasible perspectives. In this thesis, I address several key questions in some aspects of intelligence, and study the phase transitions in the two-term tradeoff, using strategies and tools from physics and information. Firstly, how can we make the learning models more flexible and efficient, so that agents can learn quickly with fewer examples? Inspired by how physicists model the world, we introduce a paradigm and an AI Physicist agent for simultaneously learning many small specialized models (theories) and the domain they are accurate, which can then be simplified, unified and stored, facilitating few-shot learning in a continual way. Secondly, for representation learning, when can we learn a good representation, and how does learning depend on the structure of the dataset? We approach this question by studying phase transitions when tuning the tradeoff hyperparameter. In the information bottleneck, we theoretically show that these phase transitions are predictable and reveal structure in the relationships between the data, the model, the learned representation and the loss landscape. Thirdly, how can agents discover causality from observations? We address part of this question by introducing an algorithm that combines prediction and minimizing information from the input, for exploratory causal discovery from observational time series. Fourthly, to make models more robust to label noise, we introduce Rank Pruning, a robust algorithm for classification with noisy labels. I believe that building on the work of my thesis we will be one step closer to enable more intelligent machines that can make sense of the world.
Synergizing Domain Expertise with Self-Awareness in Software Systems: A Patternized Architecture Guideline
Chen, Tao, Bahsoon, Rami, Yao, Xin
Architectural patterns provide a reusable architectural solution for commonly recurring problems that can assist in designing software systems. In this regard, self-awareness architectural patterns are specialized patterns that leverage good engineering practices and experiences to help in designing self-awareness and self-adaptation of a software system. However, domain knowledge and engineers' expertise that is built over time are not explicitly linked to these patterns and the self-aware process. This linkage is important, as it can enrich the design patterns of these systems, which consequently leads to more effective and efficient self-aware and self-adaptive behaviours. This paper is an introductory work that highlights the importance of synergizing domain expertise into the self-awareness in software systems, relying on well-defined underlying approaches. In particular, we present a holistic framework that classifies widely known representations used to obtain and maintain the domain expertise, documenting their nature and specifics rules that permits different levels of synergies with self-awareness. Drawing on such, we describe mechanisms that can enrich existing patterns with engineers' expertise and knowledge of the domain. This, together with the framework, allow us to codify an intuitive step-by-step methodology that guides engineer in making design decisions when synergizing domain expertise into self-awareness and reveal their importances, in an attempt to keep 'engineers-in-the-loop'. Through three case studies, we demonstrate how the enriched patterns, the proposed framework and methodology can be applied in different domains, within which we quantitatively compare the actual benefits of incorporating engineers' expertise into self-awareness, at alternative levels of synergies.
Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines
Manukian, Haik, Pei, Yan Ru, Bearden, Sean R. B., Di Ventra, Massimiliano
Restricted Boltzmann machines (RBMs) are a powerful class of generative models, but their training requires computing a gradient that, unlike supervised backpropagation on typical loss functions, is notoriously difficult even to approximate. Here, we show that properly combining standard gradient updates with an off-gradient direction, constructed from samples of the RBM ground state (mode), improves their training dramatically over traditional gradient methods. This approach, which we call mode training, promotes faster training and stability, in addition to lower converged relative entropy (KL divergence). Along with the proofs of stability and convergence of this method, we also demonstrate its efficacy on synthetic datasets where we can compute KL divergences exactly, as well as on a larger machine learning standard, MNIST. The mode training we suggest is quite versatile, as it can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures such as deep, convolutional and unrestricted Boltzmann machines.
Learning Options from Demonstration using Skill Segmentation
Cockcroft, Matthew, Mawjee, Shahil, James, Steven, Ranchod, Pravesh
We present a method for learning options from segmented demonstration trajectories. The trajectories are first segmented into skills using nonparametric Bayesian clustering and a reward function for each segment is then learned using inverse reinforcement learning. From this, a set of inferred trajectories for the demonstration are generated. Option initiation sets and termination conditions are learned from these trajectories using the one-class support vector machine clustering algorithm. We demonstrate our method in the four rooms domain, where an agent is able to autonomously discover usable options from human demonstration. Our results show that these inferred options can then be used to improve learning and planning.
Effects of sparse rewards of different magnitudes in the speed of learning of model-based actor critic methods
Vargas, Juan, Andjelic, Lazar, Farimani, Amir Barati
Actor critic methods with sparse rewards in model-based deep reinforcement learning typically require a deterministic binary reward function that reflects only two possible outcomes: if, for each step, the goal has been achieved or not. Our hypothesis is that we can influence an agent to learn faster by applying an external environmental pressure during training, which adversely impacts its ability to get higher rewards. As such, we deviate from the classical paradigm of sparse rewards and add a uniformly sampled reward value to the baseline reward to show that (1) sample efficiency of the training process can be correlated to the adversity experienced during training, (2) it is possible to achieve higher performance in less time and with less resources, (3) we can reduce the performance variability experienced seed over seed, (4) there is a maximum point after which more pressure will not generate better results, and (5) that random positive incentives have an adverse effect when using a negative reward strategy, making an agent under those conditions learn poorly and more slowly. These results have been shown to be valid for Deep Deterministic Policy Gradients using Hindsight Experience Replay in a well known Mujoco environment, but we argue that they could be generalized to other methods and environments as well.
Lipschitz Lifelong Reinforcement Learning
Lecarpentier, Erwan, Abel, David, Asadi, Kavosh, Jinnai, Yuu, Rachelson, Emmanuel, Littman, Michael L.
We consider the problem of knowledge transfer when an agent is facing a series of Reinforcement Learning (RL) tasks. We introduce a novel metric between Markov Decision Processes and establish that close MDPs have close optimal value functions. Formally, the optimal value functions are Lipschitz continuous with respect to the tasks space. These theoretical results lead us to a value transfer method for Lifelong RL, which we use to build a PAC-MDP algorithm with improved convergence rate. We illustrate the benefits of the method in Lifelong RL experiments.
Optimal by Design: Model-Driven Synthesis of Adaptation Strategies for Autonomous Systems
Elrakaiby, Yehia, Spoletini, Paola, Nuseibeh, Bashar
--Many software systems have become too large and complex to be managed efficiently by human administrators, particularly when they operate in uncertain and dynamic environments and require frequent changes. Requirements-driven adaptation techniques have been proposed to endow systems with the necessary means to autonomously decide ways to satisfy their requirements. However, many current approaches rely on general-purpose languages, models and/or frameworks to design, develop and analyze autonomous systems. Unfortunately, these tools are not tailored towards the characteristics of adaptation problems in autonomous systems. D proposes a model (and a language) for the high-level description of the basic elements of self-adaptive systems, namely the system, capabilities, requirements and environment. Based on those elements, a Markov Decision Process (MDP) is constructed to compute the optimal strategy or the most rewarding system behavior . Furthermore, this defines a reflex controller that can ensure timely responses to changes. One novel feature of the framework is that it benefits both from goal-oriented techniques, developed for requirement elicitation, refinement and analysis, and synthesis capabilities and extensive research around MDPs, their extensions and tools. Our preliminary evaluation results demonstrate the practicality and advantages of the framework. Autonomous systems such as unmanned vehicles and robots play an increasingly relevant role in our societies. Many factors contribute to the complexity in the design and development of those systems. First, they typically operate in dynamic and uncontrollable environments [1]-[5]. Therefore, they must continuously adapt their configuration in response to changes, both in their operating environment and in themselves. Since the frequency of change cannot be controlled, decision-making must be almost instantaneous to ensure timely responses. From a design and management perspective, it is desirable to minimize the effort needed to design the system and to enable its runtime updating and maintenance. A promising technique to address those challenges is requirements-driven adaptation that endow systems with the necessary means to autonomously operate based on their requirements. Requirements are prescriptive statements of intent to be satisfied by cooperation of the agents forming the system [6]. They say what the system will do and not how it will do it [7].
A Critical Look at the Applicability of Markov Logic Networks for Music Signal Analysis
Pauwels, Johan, Fazekas, Gyรถrgy, Sandler, Mark B.
In recent years, Markov logic networks (MLNs) have been proposed as a potentially useful paradigm for music signal analysis. Because all hidden Markov models can be reformulated as MLNs, the latter can provide an all-encompassing framework that reuses and extends previous work in the field. However, just because it is theoretically possible to reformulate previous work as MLNs, does not mean that it is advantageous. In this paper, we analyse some proposed examples of MLNs for musical analysis and consider their practical disadvantages when compared to formulating the same musical dependence relationships as (dynamic) Bayesian networks. We argue that a number of practical hurdles such as the lack of support for sequences and for arbitrary continuous probability distributions make MLNs less than ideal for the proposed musical applications, both in terms of easy of formulation and computational requirements due to their required inference algorithms. These conclusions are not specific to music, but apply to other fields as well, especially when sequential data with continuous observations is involved. Finally, we show that the ideas underlying the proposed examples can be expressed perfectly well in the more commonly used framework of (dynamic) Bayesian networks.
Machine Learning with TensorFlow, Second Edition
About the Technology TensorFlow, Google's library for large-scale machine learning, makes powerful ML techniques easily accessible. It simplifies often-complex computations by representing them as graphs that are mapped to machines in a cluster or to the processors of a single machine. Offering a complete ecosystem for all stages and types of machine learning, TensorFlow's end-to-end functionality empowers machine learning engineers of all skill levels to solve their problems with ML.