AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Towards Better Opioid Antagonists Using Deep Reinforcement Learning

Deng, Jianyuan, Yang, Zhibo, Li, Yao, Samaras, Dimitris, Wang, Fusheng

arXiv.org Artificial Intelligence | Mar-26-2020

Naloxone, an opioid antagonist, has been widely used to save lives from opioid overdose, a leading cause of death in the opioid epidemic. However, naloxone has short brain retention ability, which limits its therapeutic efficacy. Developing better opioid antagonists is critical in combating the opioid epidemic. Instead of exhaustively searching in a huge chemical space for better opioid antagonists, we adopt reinforcement learning which allows efficient gradient-based search towards molecules with desired physicochemical and/or biological properties. Specifically, we implement a deep reinforcement learning framework to discover potential lead compounds as better opioid antagonists with enhanced brain retention ability. A customized multi-objective reward function is designed to bias the generation towards molecules with both sufficient opioid antagonistic effect and enhanced brain retention ability. Thorough evaluation demonstrates that with this framework, we are able to identify valid, novel and feasible molecules with multiple desired properties, which has high potential in drug discovery.
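
The abstract does not spell out the reward function, so the following is only a minimal sketch of what a multi-objective reward of this kind could look like. It assumes two already-predicted property scores per molecule and combines them with a weighted geometric mean, so that a molecule scores well only if both properties are acceptable. The function names, the property ranges and the weights are all hypothetical and are not taken from the paper.

```python
import math


def desirability(value, low, high):
    """Map a raw property value to [0, 1]: 0 at or below `low`, 1 at or above `high`."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def multi_objective_reward(is_valid, antagonism, brain_retention, weights=(0.5, 0.5)):
    """Weighted geometric mean of two desirability scores.

    Invalid molecules get zero reward. The ranges below are placeholders
    chosen for illustration, not values from the paper.
    """
    if not is_valid:
        return 0.0
    scores = (
        desirability(antagonism, low=5.0, high=8.0),        # hypothetical range
        desirability(brain_retention, low=-1.0, high=1.0),  # hypothetical range
    )
    total = sum(weights)
    # A geometric mean is zero whenever either score is zero, which
    # rewards molecules that satisfy both objectives at once.
    return math.prod(s ** (w / total) for s, w in zip(scores, weights))


print(multi_objective_reward(True, antagonism=6.5, brain_retention=0.0))
```

A weighted sum would be an equally plausible choice; the geometric mean is used here only because it penalizes molecules that are strong on one objective and fail the other.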

generative model, molecule, smile string, (14 more...)

arXiv.org Artificial Intelligence

2004.04768

Country: North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

AirRL: A Reinforcement Learning Approach to Urban Air Quality Inference

Zhong, Huiqiang, Yin, Cunxiang, Wu, Xiaohui, Luo, Jinchang, He, JiaWei

arXiv.org Artificial Intelligence | Mar-26-2020

Urban air pollution has become a major environmental problem that threatens public health. It has become increasingly important to infer fine-grained urban air quality based on existing monitoring stations. One of the challenges is how to effectively select some relevant stations for air quality inference. In this paper, we propose a novel model based on reinforcement learning for urban air quality inference. The model consists of two modules: a station selector and an air quality regressor. The station selector dynamically selects the most relevant monitoring stations when inferring air quality. The air quality regressor takes in the selected stations and makes air quality inference with a deep neural network. We conduct experiments on a real-world air quality dataset and our approach achieves the highest performance compared with several popular solutions, and the experiments show significant effectiveness of the proposed model in tackling problems of air quality inference.

air quality, inference, target location, (13 more...)

arXiv.org Artificial Intelligence

2003.12205

Country:

Asia > India (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.88)
Law > Environmental Law (0.34)
Transportation > Ground (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence

Gupta, Abhishek, Haskell, William B.

arXiv.org Machine Learning | Mar-25-2020

This paper develops a unified framework, based on iterated random operator theory, to analyze the convergence of constant stepsize recursive stochastic algorithms (RSAs) in machine learning and reinforcement learning. RSAs use randomization to efficiently compute expectations, and so their iterates form a stochastic process. The key idea is to lift the RSA into an appropriate higher-dimensional space and then express it as an equivalent Markov chain. Instead of determining the convergence of this Markov chain (which may not converge under constant stepsize), we study the convergence of the distribution of this Markov chain. To study this, we define a new notion of Wasserstein divergence. We show that if the distribution of the iterates in the Markov chain satisfies a certain contraction property with respect to the Wasserstein divergence, then the Markov chain admits an invariant distribution. Inspired by the SVRG algorithm, we develop a method to convert any RSA to a variance reduced RSA that converges to the optimal solution in the almost sure sense or in probability. We show that convergence of a large family of constant stepsize RSAs can be understood using this framework. We apply this framework to ascertain the convergence of mini-batch SGD, forward-backward splitting with catalyst, SVRG, SAGA, empirical Q value iteration, synchronous Q-learning, enhanced policy iteration, and MDPs with a generative model. We also develop two new algorithms for reinforcement learning and establish their convergence using this framework.
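
The point that constant stepsize iterates need not converge while their distribution does can be seen in a toy case. The sketch below is not the paper's framework and does not use the Wasserstein divergence; it only assumes a one-dimensional quadratic objective f(x) = E[(x - Z)^2] / 2 with Gaussian Z, for which constant stepsize SGD is a Markov chain whose stationary mean and variance have a closed form. All names and parameter values are illustrative.

```python
import random


def stationary_moments(step, mu, sigma):
    """Closed-form mean and variance of the stationary law of the iterates.

    The update x <- (1 - step) * x + step * z gives
    V = (1 - step)^2 * V + step^2 * sigma^2, hence V = step * sigma^2 / (2 - step).
    Valid for 0 < step < 2.
    """
    return mu, step * sigma ** 2 / (2.0 - step)


def run_sgd(step, mu, sigma, n_steps, seed=0, x0=0.0):
    """Constant-stepsize SGD on f(x) = E[(x - Z)^2] / 2 with Z ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    x, path = x0, []
    for _ in range(n_steps):
        z = rng.gauss(mu, sigma)  # one noisy sample stands in for the expectation
        x -= step * (x - z)       # stochastic gradient step with fixed stepsize
        path.append(x)
    return path


def empirical_moments(samples):
    """Sample mean and (biased) sample variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


path = run_sgd(step=0.1, mu=3.0, sigma=1.0, n_steps=100_000)
tail = path[1_000:]  # discard the initial transient
print("empirical :", empirical_moments(tail))
print("closed form:", stationary_moments(0.1, 3.0, 1.0))
```

The iterates keep fluctuating around the minimizer forever, but the empirical moments of the trajectory settle near the closed-form stationary values, which is the distribution-level notion of convergence the abstract refers to.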

algorithm, divergence, iteration, (15 more...)

arXiv.org Machine Learning

2003.11403

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Adaptive Conditional Neural Movement Primitives via Representation Sharing Between Supervised and Reinforcement Learning

Akbulut, M. Tuluhan, Seker, M. Yunus, Tekden, Ahmet E., Nagai, Yukie, Oztop, Erhan, Ugur, Emre

arXiv.org Artificial Intelligence | Mar-25-2020

Learning by Demonstration provides a sample efficient way to equip robots with complex sensorimotor skills in a supervised manner. Several movement primitive representations can be used for flexible motor representation and learning. A recent state-of-the-art approach is Conditional Neural Movement Primitives (CNMP) that can learn non-linear relations between environment parameters and complex multi-modal trajectories from a few expert demonstrations by forming powerful latent space representations. In this study, to improve the applicability of CNMP to changing tasks and/or environments, we couple it with a reinforcement learning agent that exploits the representations formed by the original CNMP network, and learns to generate synthetic demonstrations for further learning. This enables the CNMP network to generalize to new environments by adapting its internal representations. In the current implementation, the reinforcement learning agent is triggered when a failure in task execution is detected, and the CNMP is trained with the newly discovered demonstration (trajectory), which shares essential characteristics with the original demonstrations due to the representation sharing. As a result, the overall system increases its capacity and handles situations in scenarios where the initial CNMP network cannot produce a useful trajectory. To show the validity of our proposed model, we compare our approach with original CNMP work and other movement primitives approaches. Furthermore, we present the experimental results from the implementation of the proposed model on real robotics setups, which indicate the applicability of our approach as an effective adaptive learning by demonstration system.

demonstration, representation, trajectory, (14 more...)

arXiv.org Artificial Intelligence

2003.11334

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

The 10 Best Free Artificial Intelligence And Machine Learning Courses for 2020

#artificialintelligence | Mar-24-2020, 09:53:11 GMT

The demand for people with knowledge and skills in artificial intelligence (AI) and machine learning (ML) hugely outstrips the supply. This means that learning and gaining qualifications in these subjects can be a great way to enhance your career prospects. However, not everyone has the spare time and money to spend years studying for a degree or other formal qualifications. Today, with the wealth of freely available educational content online, it may not be necessary. There are so many courses, tutorials, and guides available online that it is perfectly possible to gain a thorough grounding in these subjects without paying a penny.

artificial intelligence, best free artificial intelligence, intelligence and machine learning course, (10 more...)

#artificialintelligence

Country: Europe > Finland > Uusimaa > Helsinki (0.05)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting > Online (0.50)
Education > Educational Technology > Educational Software > Computer Based Training (0.31)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Reinforcement Learning: The Algorithms Changing How Computers Make Decisions

#artificialintelligence | Mar-24-2020, 03:20:55 GMT

The last decade of tech was in large part defined by the advent of Deep Supervised Learning (DL). The availability of cheap data at scale, computational power, and researcher interest have made it the de facto school of algorithms used for most pattern recognition problems. Face recognition on social media, product recommendations on sites, and voice assistants like Google Assistant, Alexa, and Siri are some examples largely powered by DL. The issue with deep learning is that the resources that led to its rise are also giving rise to inequities. Today, it is tough for startups to beat 'big tech' like Apple, Google, Amazon, and Microsoft in deep learning through better research capabilities or better data.

computer make decision, learning, reinforcement learning, (6 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)

Distributional Reinforcement Learning with Ensembles

Lindenberg, Björn, Nordqvist, Jonas, Lindahl, Karl-Olof

arXiv.org Artificial Intelligence | Mar-24-2020

It is well-known that ensemble methods often provide enhanced performance in reinforcement learning. In this paper we explore this concept further by using group-aided training within the distributional reinforcement learning paradigm. Specifically, we propose an extension to categorical reinforcement learning, where distributional learning targets are implicitly based on the total information gathered by an ensemble. We empirically show that this may lead to much more robust initial learning, a stronger individual performance level and good efficiency on a per-sample basis.
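
The abstract does not give the exact construction of the ensemble-based targets, so the sketch below shows only one plausible reading: each ensemble member outputs a categorical distribution over a shared, fixed set of return atoms for every action, the members' distributions are averaged, and the greedy action is chosen from the averaged distribution. The function names and the data layout are assumptions made for illustration.

```python
def ensemble_mean(distributions):
    """Average several categorical distributions defined on the same atoms."""
    n = len(distributions)
    size = len(distributions[0])
    return [sum(p[i] for p in distributions) / n for i in range(size)]


def expected_value(probs, atoms):
    """Expected return of a categorical distribution over `atoms`."""
    return sum(p * z for p, z in zip(probs, atoms))


def greedy_action(member_outputs, atoms):
    """Pick the action with the highest expected return under the averaged distribution.

    member_outputs[m][a] is member m's probability vector for action a
    (a hypothetical layout, assumed here for illustration).
    """
    n_actions = len(member_outputs[0])
    values = []
    for a in range(n_actions):
        mixed = ensemble_mean([member[a] for member in member_outputs])
        values.append(expected_value(mixed, atoms))
    best = max(range(n_actions), key=values.__getitem__)
    return best, values


atoms = [0.0, 1.0, 2.0]
member_outputs = [
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],  # member 0: action 0, action 1
    [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]],  # member 1: action 0, action 1
]
print(greedy_action(member_outputs, atoms))
```

In a full categorical agent the averaged distribution for the chosen next action would then be shifted by the reward, scaled by the discount and projected back onto the atoms to form the learning target; that projection step is omitted here.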

agent, ensemble, learning, (13 more...)

arXiv.org Artificial Intelligence

2003.10903

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods

Zhi, Jiale, Wang, Rui, Clune, Jeff, Stanley, Kenneth O.

arXiv.org Machine Learning | Mar-24-2020

Recent advances in machine learning are consistently enabled by increasing amounts of computation. Reinforcement learning (RL) and population-based methods in particular pose unique challenges for efficiency and flexibility to the underlying distributed computing frameworks. These challenges include frequent interaction with simulations, the need for dynamic scaling, and the need for a user interface with low adoption cost and consistency across different backends. In this paper we address these challenges while still retaining development efficiency and flexibility for both research and practical applications by introducing Fiber, a scalable distributed computing framework for RL and population-based methods. Fiber aims to significantly expand the accessibility of large-scale parallel computation to users of otherwise complicated RL and population-based approaches without the need for specialized computational expertise.

algorithm, application, fiber, (13 more...)

arXiv.org Machine Learning

2003.11164

Country:

North America > United States (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

An empirical investigation of the challenges of real-world reinforcement learning

Dulac-Arnold, Gabriel, Levine, Nir, Mankowitz, Daniel J., Li, Jerry, Paduraru, Cosmin, Gowal, Sven, Hester, Todd

arXiv.org Artificial Intelligence | Mar-24-2020

Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real-world problems. Our proposed challenges are implemented in a suite of continuous control environments called realworldrl-suite which we propose as an open-source benchmark.

agent, algorithm, constraint, (12 more...)

arXiv.org Artificial Intelligence

2003.11881

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.82)

Industry:

Transportation (0.67)
Leisure & Entertainment > Games (0.67)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning

Mousavi, Ali, Li, Lihong, Liu, Qiang, Zhou, Denny

arXiv.org Artificial Intelligence | Mar-24-2020

Off-policy estimation for long-horizon problems is important in many real-life applications such as healthcare and robotics, where high-fidelity simulators may not be available and on-policy evaluation is expensive or impossible. Recently, Liu et al. (2018) proposed an approach that avoids the curse of horizon suffered by typical importance-sampling-based methods. While showing promising results, this approach is limited in practice as it requires data be drawn from the stationary distribution of a known behavior policy. In this work, we propose a novel approach that eliminates such limitations. In particular, we formulate the problem as solving for the fixed point of a certain operator. Using tools from Reproducing Kernel Hilbert Spaces (RKHSs), we develop a new estimator that computes importance ratios of stationary distributions, without knowledge of how the off-policy data are collected. We analyze its asymptotic consistency and finite-sample generalization. Experiments on benchmarks verify the effectiveness of our approach.
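
To make the role of stationary-distribution importance ratios concrete, here is a small tabular illustration. It is not the paper's RKHS estimator: it assumes the transition matrices induced by the behavior and target policies are known, which is exactly the knowledge the paper aims to do without, and it uses a state-only reward. The two-state chains, the rewards and the function names are invented for the example.

```python
def stationary(P, iters=1_000):
    """Stationary distribution of a row-stochastic matrix via power iteration."""
    n = len(P)
    d = [1.0 / n] * n
    for _ in range(iters):
        d = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
    return d


# State-to-state transition matrices induced by each policy (toy values).
P_behavior = [[0.9, 0.1], [0.5, 0.5]]
P_target = [[0.2, 0.8], [0.4, 0.6]]
reward = [0.0, 1.0]  # reward depends on the state only, for simplicity

d_b = stationary(P_behavior)
d_pi = stationary(P_target)

# Importance ratio of stationary distributions: one weight per state,
# rather than a product of per-step policy ratios along a trajectory.
w = [p / q for p, q in zip(d_pi, d_b)]

# Average reward of the target policy, written as an expectation under
# the behavior policy's stationary distribution.
off_policy_value = sum(q * wi * r for q, wi, r in zip(d_b, w, reward))
on_policy_value = sum(p * r for p, r in zip(d_pi, reward))
print(off_policy_value, on_policy_value)
```

With data instead of a known model, the expectation under the behavior distribution becomes a sample average over logged states and the ratio has to be estimated. Because there is a single weight per state, the estimate does not involve the long products of per-step ratios that cause trajectory-wise importance sampling to degrade with horizon.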

behavior policy, stationary distribution, trajectory, (14 more...)

arXiv.org Artificial Intelligence

2003.11126

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (0.66)
Transportation > Air (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
