Goto

Collaborating Authors

 Reinforcement Learning


The Future of AI Part 1

#artificialintelligence

It was reported that Venture Capital investments into AI related startups made a significant increase in 2018, jumping by 72% compared to 2017, with 466 startups funded from 533 in 2017. PWC moneytree report stated that that seed-stage deal activity in the US among AI-related companies rose to 28% in the fourth-quarter of 2018, compared to 24% in the three months prior, while expansion-stage deal activity jumped to 32%, from 23%. There will be an increasing international rivalry over the global leadership of AI. President Putin of Russia was quoted as saying that "the nation that leads in AI will be the ruler of the world". Billionaire Mark Cuban was reported in CNBC as stating that "the world's first trillionaire would be an AI entrepreneur".


Reinforcement Learning-based N-ary Cross-Sentence Relation Extraction

arXiv.org Machine Learning

The models of n-ary cross sentence relation extraction based on distant supervision assume that consecutive sentences mentioning n entities describe the relation of these n entities. However, on one hand, this assumption introduces noisy labeled data and harms the models' performance. On the other hand, some non-consecutive sentences also describe one relation and these sentences cannot be labeled under this assumption. In this paper, we relax this strong assumption by a weaker distant supervision assumption to address the second issue and propose a novel sentence distribution estimator model to address the first problem. This estimator selects correctly labeled sentences to alleviate the effect of noisy data is a two-level agent reinforcement learning model. In addition, a novel universal relation extractor with a hybrid approach of attention mechanism and PCNN is proposed such that it can be deployed in any tasks, including consecutive and nonconsecutive sentences. Experiments demonstrate that the proposed model can reduce the impact of noisy data and achieve better performance on general n-ary cross sentence relation extraction task compared to baseline models.


Complementary Meta-Reinforcement Learning for Fault-Adaptive Control

arXiv.org Machine Learning

Faults are endemic to all systems. Adaptive fault-tolerant control maintains degraded performance when faults occur as opposed to unsafe conditions or catastrophic events. In systems with abrupt faults and strict time constraints, it is imperative for control to adapt quickly to system changes to maintain system operations. We present a meta-reinforcement learning approach that quickly adapts its control policy to changing conditions. The approach builds upon model-agnostic meta learning (MAML). The controller maintains a complement of prior policies learned under system faults. This "library" is evaluated on a system after a new fault to initialize the new policy. This contrasts with MAML, where the controller derives intermediate policies anew, sampled from a distribution of similar systems, to initialize a new policy. Our approach improves sample efficiency of the reinforcement learning process. We evaluate our approach on an aircraft fuel transfer system under abrupt faults.


Lineage Evolution Reinforcement Learning

arXiv.org Artificial Intelligence

We propose a general agent population learning system, and on this basis, we propose lineage evolution reinforcement learning algorithm. Lineage evolution reinforcement learning is a kind of derivative algorithm which accords with the general agent population learning system. We take the agents in DQN and its related variants as the basic agents in the population, and add the selection, mutation and crossover modules in the genetic algorithm to the reinforcement learning algorithm. In the process of agent evolution, we refer to the characteristics of natural genetic behavior, add lineage factor to ensure the retention of potential performance of agent, and comprehensively consider the current performance and lineage value when evaluating the performance of agent. Without changing the parameters of the original reinforcement learning algorithm, lineage evolution reinforcement learning can optimize different reinforcement learning algorithms. Our experiments show that the idea of evolution with lineage improves the performance of original reinforcement learning algorithm in some games in Atari 2600.


Graph neural induction of value iteration

arXiv.org Artificial Intelligence

Many reinforcement learning tasks can benefit from explicit planning based on an internal model of the environment. Previously, such planning components have been incorporated through a neural network that partially aligns with the computational graph of value iteration. Such network have so far been focused on restrictive environments (e.g. grid-worlds), and modelled the planning procedure only indirectly. We relax these constraints, proposing a graph neural network (GNN) that executes the value iteration (VI) algorithm, across arbitrary environment models, with direct supervision on the intermediate steps of VI. The results indicate that GNNs are able to model value iteration accurately, recovering favourable metrics and policies across a variety of out-of-distribution tests. This suggests that GNN executors with strong supervision are a viable component within deep reinforcement learning systems.


Using deep learning to control the unconsciousness level of patients in an anesthetic state

#artificialintelligence

In recent years, researchers have been developing machine learning algorithms for an increasingly wide range of purposes. This includes algorithms that can be applied in healthcare settings, for instance helping clinicians to diagnose specific diseases or neuropsychiatric disorders or monitor the health of patients over time. Researchers at Massachusetts Institute of Technology (MIT) and Massachusetts General Hospital have recently carried out a study investigating the possibility of using deep reinforcement learning to control the levels of unconsciousness of patients who require anesthesia for a medical procedure. Their paper, set to be published in the proceedings of the 2020 International Conference on Artificial Intelligence in Medicine, was voted the best paper presented at the conference. "Our lab has made significant progress in understanding how anesthetic medications affect neural activity and now has a multidisciplinary team studying how to accurately determine anesthetic doses from neural recordings," Gabriel Schamberg, one of the researchers who carried out the study, told TechXplore.


A robot triumphs in a curling match against elite humans

#artificialintelligence

A robot equipped with artificial intelligence (AI) can excel at the Olympic sport of curling -- and even beat top-level human teams. Success requires precision and strategy, but the game is less complex than other real-world applications of robotics. That makes curling a useful test case for AI technologies, which often perform well in simulations but falter in real-world scenarios with changing conditions. Using a method called adaptive deep reinforcement learning, Seong-Whan Lee and his colleagues at Korea University in Seoul created an algorithm that learns through trial and error to adjust a robot's throws to account for changing conditions, such as the ice surface and the positions of stones. The team's robot, nicknamed Curly, needed a few test throws to calibrate itself to the curling rink where it was to compete.


Probabilistic Machine Learning for Healthcare

#artificialintelligence

Machine learning can be used to make sense of healthcare data. Probabilistic machine learning models help provide a complete picture of observed data in healthcare. In this review, we examine how probabilistic machine learning can advance healthcare. We consider challenges in the predictive model building pipeline where probabilistic models can be beneficial including calibration and missing data. Beyond predictive models, we also investigate the utility of probabilistic machine learning models in phenotyping, in generative models for clinical use cases, and in reinforcement learning.


Symbolic Relational Deep Reinforcement Learning based on Graph Neural Networks

arXiv.org Artificial Intelligence

We present a novel deep reinforcement learning framework for solving relational problems. The method operates with a symbolic representation of objects, their relations and multi-parameter actions, where the objects are the parameters. Our framework, based on graph neural networks, is completely domain-independent and can be applied to any relational problem with existing symbolic-relational representation. We show how to represent relational states with arbitrary goals, multi-parameter actions and concurrent actions. We evaluate the method on a set of three domains: BlockWorld, Sokoban and SysAdmin. The method displays impressive generalization over different problem sizes (e.g., in BlockWorld, the method trained exclusively with 5 blocks still solves 78% of problems with 20 blocks) and readiness for curriculum learning.


robosuite: A Modular Simulation Framework and Benchmark for Robot Learning

arXiv.org Artificial Intelligence

We introduce robosuite, a modular simulation framework and benchmark for robot learning. This framework is powered by the MuJoCo physics engine [15], which performs fast physical simulation of contact dynamics. The overarching goal of this framework is to facilitate research and development of data-driven robotic algorithms and techniques. The development of this framework was initiated from the SURREAL project [3] on distributed reinforcement learning for robot manipulation, and is now part of the broader Advancing Robot Intelligence through Simulated Environments (ARISE) Initiative, with the aim of lowering the barriers of entry for cutting-edge research at the intersection of AI and Robotics. Data-driven algorithms [9], such as reinforcement learning [13, 7] and imitation learning [12], provide a powerful and generic tool in robotics. These learning paradigms, fueled by new advances in deep learning, have achieved some exciting successes in a variety of robot control problems. Nonetheless, the challenges of reproducibility and the limited accessibility of robot hardware have impaired research progress [5]. In recent years, advances in physics-based simulations and graphics have led to a series of simulated platforms and toolkits [1, 14, 8, 2, 16] that have accelerated scientific progress on robotics and embodied AI. Through the robosuite project we aim to provide researchers with: 1. a modular design that offers great flexibility to create new robot simulation environments and tasks;