 Undirected Networks


Virtuously Safe Reinforcement Learning

arXiv.org Artificial Intelligence

We show that when a third party, the adversary, steps into the two-party setting (agent and operator) of safely interruptible reinforcement learning, a trade-off has to be made between the probability of following the optimal policy in the limit, and the probability of escaping a dangerous situation created by the adversary. So far, the work on safely interruptible agents has assumed a perfect perception of the agent about its environment (no adversary), and therefore implicitly set the second probability to zero, by explicitly seeking a value of one for the first probability. We show that (1) agents can be made both interruptible and adversary-resilient, and (2) the interruptibility can be made safe in the sense that the agent itself will not seek to avoid it. We also solve the problem that arises when the agent does not go completely greedy, i.e. issues with safe exploration in the limit. Resilience to perturbed perception, safe exploration in the limit, and safe interruptibility are the three pillars of what we call virtuously safe reinforcement learning.


Propositional Knowledge Representation and Reasoning in Restricted Boltzmann Machines

arXiv.org Artificial Intelligence

While knowledge representation and reasoning are considered the keys to human-level artificial intelligence, connectionist networks have been shown to be successful in a broad range of applications due to their capacity for robust learning and flexible inference under uncertainty. The idea of representing symbolic knowledge in connectionist networks has been well received and has attracted much attention from the research community, as it can establish a foundation for integrating scalable learning and sound reasoning. Previous work includes a number of approaches that map logical inference rules to the feed-forward propagation of artificial neural networks (ANNs). However, the discriminative structure of an ANN requires the separation of input and output variables, which makes general reasoning difficult when any variable should be inferable. Other approaches address this issue by employing generative models such as symmetric connectionist networks; however, they are difficult and convoluted. In this paper, we propose a novel method to represent propositional formulas in restricted Boltzmann machines which is less complex, especially in the cases of logical implications and Horn clauses. An integration system is then developed and evaluated on real datasets, showing promising results.


Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting

arXiv.org Artificial Intelligence

Inspired by how humans summarize long documents, we propose an accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively (i.e., compresses and paraphrases) to generate a concise overall summary. We use a novel sentence-level policy gradient method to bridge the non-differentiable computation between these two neural networks in a hierarchical way, while maintaining language fluency. Empirically, we achieve the new state-of-the-art on all metrics (including human evaluation) on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores. Moreover, by first operating at the sentence-level and then the word-level, we enable parallel decoding of our neural generative model that results in substantially faster (10-20x) inference speed as well as 4x faster training convergence than previous long-paragraph encoder-decoder models. We also demonstrate the generalization of our model on the test-only DUC-2002 dataset, where we achieve higher scores than a state-of-the-art model.


Training Medical Image Analysis Systems like Radiologists

arXiv.org Artificial Intelligence

The training of medical image analysis systems using machine learning approaches follows a common script: collect and annotate a large dataset, train the classifier on the training set, and test it on a holdout test set. This process bears no direct resemblance to radiologist training, which is based on solving a series of tasks of increasing difficulty, where each task involves the use of significantly smaller datasets than those used in machine learning. In this paper, we propose a novel training approach inspired by how radiologists are trained. In particular, we explore the use of meta-training that models a classifier based on a series of tasks. Tasks are selected using teacher-student curriculum learning, where each task consists of simple classification problems containing small training sets. We hypothesize that our proposed meta-training approach can be used to pre-train medical image analysis models. This hypothesis is tested on automatic breast screening classification from DCE-MRI, trained with weakly labeled datasets. The classification performance achieved by our approach is shown to be the best in the field for that application, compared to state-of-the-art baseline approaches: DenseNet, multiple-instance learning, and multi-task learning.


GAN Q-learning

arXiv.org Machine Learning

Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can leverage the distributional approach to reinforcement learning. In this paper, we propose GAN Q-learning, a novel distributional RL method based on generative adversarial networks (GANs), and analyze its performance in simple tabular environments as well as in OpenAI Gym. We empirically show that our algorithm leverages the flexibility and black-box approach of deep learning models while providing a viable alternative to traditional methods.


Contextual Graph Markov Model: A Deep and Generative Approach to Graph Processing

arXiv.org Artificial Intelligence

We introduce the Contextual Graph Markov Model, an approach combining ideas from generative models and neural networks for the processing of graph data. It is founded on a constructive methodology to build a deep architecture comprising layers of probabilistic models that learn to encode the structured information in an incremental fashion. Context is diffused in an efficient and scalable way across the graph vertices and edges. The resulting graph encoding is used in combination with discriminative models to address structure classification benchmarks.


Classification-Based Machine Learning for Finance

@machinelearnbot

Finally, a comprehensive hands-on machine learning course with a specific focus on classification-based models for the investment community and passionate investors. In the past few years, there has been massive adoption of and growth in the use of data science, artificial intelligence, and machine learning to find alpha. However, information on, and applications of, machine learning in investment are scarce. This course has been designed to address that. It is meant to spark your creative juices and get you started in this space.


Non-Technical Person's Guide To Entering The Machine Learning Industry

#artificialintelligence

As the buzz around data science grows every day, there is a slew of self-taught professionals who have kick-started their machine learning journey with Andrew Ng's online courses. Many enthusiasts are gravitating towards the computer science field. But anyone who wants to pursue a career in machine learning needs to be familiar with statistics and linear algebra. With computer science and ML applications becoming more pervasive in everyday life, people from non-technical backgrounds are also interested in joining the field. In this article, we discuss in depth the roles that a person from a non-technical background can explore in the data science/AI field.


Face Recognition for Beginners – Towards Data Science

#artificialintelligence

Face recognition is a technique used to identify the faces of individuals whose images are saved in a data set. Although other methods of identification can be more accurate, face recognition has always remained a significant focus of research because of its non-intrusive nature and because it is the method people themselves most readily use for personal identification. Face recognition algorithms are classified as geometry-based or template-based. Template-based methods can be constructed using statistical tools such as SVM [Support Vector Machines], PCA [Principal Component Analysis], LDA [Linear Discriminant Analysis], kernel methods, or trace transforms. Geometric feature-based methods analyse local facial features and their geometric relationships.
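As a rough illustration of the template-based idea, the sketch below uses PCA to compress image vectors and then matches a query to the nearest stored template. It is only a minimal example under stated assumptions, not the article's implementation: the function names (`fit_pca`, `project`, `nearest_identity`) are hypothetical, and random vectors stand in for real face images.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X, mean, components):
    """Map image vectors to low-dimensional PCA codes."""
    return (X - mean) @ components.T

def nearest_identity(query, gallery_codes, gallery_labels, mean, components):
    """Label of the gallery template closest to the query in PCA space."""
    code = project(query[None, :], mean, components)
    dists = np.linalg.norm(gallery_codes - code, axis=1)
    return gallery_labels[int(np.argmin(dists))]

# Synthetic stand-in for face images (assumption): 3 identities, 4 noisy
# 64-dimensional samples each.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 64))
gallery_labels = np.repeat(np.arange(3), 4)
gallery = prototypes[gallery_labels] + 0.05 * rng.normal(size=(12, 64))

mean, components = fit_pca(gallery, n_components=2)
gallery_codes = project(gallery, mean, components)

# A new noisy view of identity 1 is matched against the gallery.
query = prototypes[1] + 0.05 * rng.normal(size=64)
predicted = nearest_identity(query, gallery_codes, gallery_labels, mean, components)
print(predicted)
```

With real images, each row of `gallery` would be a flattened, aligned grayscale face, and the number of components would be chosen by validation rather than fixed at 2.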


Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces

arXiv.org Artificial Intelligence

Policy evaluation, commonly referred to as value function approximation, is a central part of many reinforcement learning (RL) algorithms [27]; its task is to estimate value functions for a fixed policy in a discounted Markov Decision Process (MDP) environment. The value function of each state specifies the accumulated reward an agent would receive in the future by following the fixed policy from that state. Value functions have been widely investigated in RL applications, and they can provide insightful and important information for the agent to obtain an optimal policy, such as important board configurations in Go [24], failure probabilities of large telecommunication networks [9], taxi-out times at large airports [2], and so on. Although value functions can be approximated in different ways, the simplest form, linear approximation, is still widely adopted and studied due to its good generalization abilities, relatively efficient computation, and solid theoretical guarantees [27, 7, 13, 16]. Temporal Difference (TD) learning is a common approach to policy evaluation with linear function approximation [27]. Typical TD algorithms can be divided into two categories: gradient-based methods (e.g., GTD(λ) [28]) and least-squares (LS) based methods (e.g., LSTD(λ) [4]). A good survey of these algorithms can be found in [17, 6, 12, 7, 13]. With the development of information technologies, high-dimensional data is widely seen in RL applications [26, 30, 23], which poses serious challenges for designing scalable and computationally efficient algorithms for the linear value function approximation problem. To address this practical issue, several approaches have been developed for efficient and effective value function approximation.
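To make the least-squares approach concrete, the following is a minimal sketch of LSTD with λ = 0 (no eligibility traces), plus a Gaussian random projection for high-dimensional features. It is an illustration under assumptions, not the paper's algorithm: the function names, the small ridge term, the projection scaling, and the toy cyclic chain are all hypothetical choices made for the example.

```python
import numpy as np

def lstd(phi, rewards, phi_next, gamma=0.9, ridge=1e-8):
    """LSTD(0): solve A theta = b, where
    A = sum_t phi_t (phi_t - gamma * phi_{t+1})^T and b = sum_t phi_t r_t.
    The small ridge term (an assumption) keeps A invertible."""
    d = phi.shape[1]
    A = phi.T @ (phi - gamma * phi_next) + ridge * np.eye(d)
    b = phi.T @ rewards
    return np.linalg.solve(A, b)

def random_projection(phi, k, rng):
    """Project D-dimensional features to k dimensions with a Gaussian matrix."""
    D = phi.shape[1]
    P = rng.normal(0.0, 1.0 / np.sqrt(k), size=(D, k))
    return phi @ P, P

def cyclic_chain(n_states=5):
    """Toy fixed-policy MDP: state s moves to (s + 1) mod n; leaving state 0 pays 1."""
    states = np.arange(n_states)
    next_states = (states + 1) % n_states
    rewards = (states == 0).astype(float)
    return states, next_states, rewards

# With one-hot (tabular) features, LSTD recovers the value function itself.
states, next_states, rewards = cyclic_chain()
onehot = np.eye(5)
theta = lstd(onehot[states], rewards, onehot[next_states])
print(theta)
```

For high-dimensional features, one would first call `random_projection` on both the current and next-state feature matrices with the same matrix `P`, then run `lstd` on the projected features; the estimate then lives in the k-dimensional projected space.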