AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
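As a small illustration of "discovering which actions yield the most reward by trying them", the sketch below runs an epsilon-greedy agent on a multi-armed bandit. It is a minimal example and not taken from the book; the function name `run_bandit`, the Gaussian reward model, and all parameter values are assumptions chosen for illustration.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Gaussian multi-armed bandit (illustrative)."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # running estimate of each action's reward
    counts = [0] * n_actions       # how often each action has been tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a uniformly random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the best current estimate.
            action = max(range(n_actions), key=lambda a: estimates[a])
        # The agent only ever observes a noisy numerical reward.
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental sample-average update of the estimate.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.0, 1.0, 3.0])
    print("estimated rewards:", [round(e, 2) for e in estimates])
    print("times each action was tried:", counts)
```

The agent is never told which action is best; it only sees rewards for the actions it tries, and its choices concentrate on the highest-paying action as the estimates improve.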

A Reinforcement Learning Approach for Transient Control of Liquid Rocket Engines

Waxenegger-Wilfing, Günther, Dresia, Kai, Deeken, Jan Christian, Oschwald, Michael

arXiv.org Machine Learning, Jun-19-2020

Nowadays, liquid rocket engines use closed-loop control at most near steady operating conditions. The control of the transient phases is traditionally performed in open-loop due to highly nonlinear system dynamics. This situation is unsatisfactory, in particular for reusable engines. The open-loop control system cannot provide optimal engine performance due to external disturbances or the degeneration of engine components over time. In this paper, we study a deep reinforcement learning approach for optimal control of a generic gas-generator engine's continuous start-up phase. It is shown that the learned policy can reach different steady-state operating points and convincingly adapt to changing system parameters. A quantitative comparison with carefully tuned open-loop sequences and PID controllers is included. The deep reinforcement learning controller achieves the highest performance and requires only minimal computational effort to calculate the control action, which is a big advantage over approaches that require online optimization, such as model predictive control.

controller, survey article, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

2006.11108

Country:

Europe > Spain (0.28)
Europe > Germany (0.28)
North America > United States > California (0.28)
(2 more...)

Genre: Research Report (0.40)

Industry:

Aerospace & Defense (0.94)
Energy > Oil & Gas > Upstream (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

NROWAN-DQN: A Stable Noisy Network with Noise Reduction and Online Weight Adjustment for Exploration

Han, Shuai, Zhou, Wenbo, Liu, Jing, Lü, Shuai

arXiv.org Artificial Intelligence, Jun-19-2020

Deep reinforcement learning has been applied more and more widely nowadays, especially in various complex control tasks. Effective exploration for noisy networks is one of the most important issues in deep reinforcement learning. Noisy networks tend to produce stable outputs for agents. However, this tendency is not always enough to find a stable policy for an agent, which decreases efficiency and stability during the learning process. Based on NoisyNets, this paper proposes an algorithm called NROWAN-DQN, i.e., Noise Reduction and Online Weight Adjustment NoisyNet-DQN. Firstly, we develop a novel noise reduction method for NoisyNet-DQN to make the agent perform stable actions. Secondly, we design an online weight adjustment strategy for noise reduction, which improves stable performance and gets higher scores for the agent. Finally, we evaluate this algorithm in four standard domains and analyze properties of hyper-parameters. Our results show that NROWAN-DQN outperforms prior algorithms in all these domains. In addition, NROWAN-DQN also shows better stability. The variance of the NROWAN-DQN score is significantly reduced, especially in some action-sensitive environments. This means that in some environments where high stability is required, NROWAN-DQN will be more appropriate than NoisyNets-DQN.
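For readers unfamiliar with noisy networks, the sketch below shows one forward pass of a noisy linear layer in the usual NoisyNet form, y = (mu_w + sigma_w * eps_w) x + (mu_b + sigma_b * eps_b), with independent Gaussian noise. It illustrates only the base layer that NROWAN-DQN builds on; the paper's noise reduction and online weight adjustment are not shown, and the function name and plain-list representation are assumptions.

```python
import random


def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """One forward pass of a noisy linear layer with independent Gaussian noise.

    Computes y = (mu_w + sigma_w * eps_w) x + (mu_b + sigma_b * eps_b),
    drawing fresh noise eps ~ N(0, 1) for every weight and bias.
    """
    out = []
    for i in range(len(mu_w)):  # one output unit per weight row
        total = mu_b[i] + sigma_b[i] * rng.gauss(0.0, 1.0)
        for j in range(len(x)):
            w = mu_w[i][j] + sigma_w[i][j] * rng.gauss(0.0, 1.0)
            total += w * x[j]
        out.append(total)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, 2.0]
    mu_w = [[1.0, 0.0], [0.0, 1.0]]
    sigma_w = [[0.1, 0.1], [0.1, 0.1]]
    # Two calls give different outputs because the noise is resampled.
    print(noisy_linear(x, mu_w, sigma_w, [0.5, -0.5], [0.1, 0.1], rng))
    print(noisy_linear(x, mu_w, sigma_w, [0.5, -0.5], [0.1, 0.1], rng))
```

In this form the sigma parameters control how much the layer's output varies between calls; with every sigma at zero the layer reduces to an ordinary linear layer, which is why shrinking the learned noise scale yields more stable actions.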

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2006.10980

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Jilin Province (0.04)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Deep Reinforcement Learning for Human-Like Driving Policies in Collision Avoidance Tasks of Self-Driving Cars

Emuna, Ran, Borowsky, Avinoam, Biess, Armin

arXiv.org Machine Learning, Jun-19-2020

The technological and scientific challenges involved in the development of autonomous vehicles (AVs) are currently of primary interest for many automobile companies and research labs. However, human-controlled vehicles are likely to remain on the roads for several decades to come and may share with AVs the traffic environments of the future. In such mixed environments, AVs should deploy human-like driving policies and negotiation skills to enable smooth traffic flow. To generate automated human-like driving policies, we introduce a model-free, deep reinforcement learning approach to imitate an experienced human driver's behavior. We study a static obstacle avoidance task on a two-lane highway road in simulation (Unity). Our control algorithm receives a stochastic feedback signal from two sources: a model-driven part, encoding simple driving rules, such as lane-keeping and speed control, and a stochastic, data-driven part, incorporating human expert knowledge from driving data. To assess the similarity between machine and human driving, we model distributions of track position and speed as Gaussian processes. We demonstrate that our approach leads to human-like driving policies.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

2006.04218

Country:

North America > United States (0.14)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.50)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension

Wang, Ruosong, Salakhutdinov, Ruslan, Yang, Lin F.

arXiv.org Machine Learning, Jun-19-2020

Value function approximation has demonstrated phenomenal empirical success in reinforcement learning (RL). Nevertheless, despite recent progress on developing theory for RL with linear function approximation, the understanding of general function approximation schemes largely remains missing. In this paper, we establish a provably efficient RL algorithm with general value function approximation. We show that if the value functions admit an approximation with a function class $\mathcal{F}$, our algorithm achieves a regret bound of $\widetilde{O}(\mathrm{poly}(dH)\sqrt{T})$ where $d$ is a complexity measure of $\mathcal{F}$ that depends on the eluder dimension [Russo and Van Roy, 2013] and log-covering numbers, $H$ is the planning horizon, and $T$ is the number of interactions with the environment. Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment. Moreover, our algorithm is model-free and provides a framework to justify the effectiveness of algorithms used in practice.
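For orientation, regret in episodic RL is commonly defined as the cumulative gap between the optimal value and the value of the policy actually played. The abstract does not spell out its definition, so the notation below ($K$ episodes, policy $\pi_k$ in episode $k$, initial state $s_1^k$) is an assumption and may differ from the paper's.

```latex
\mathrm{Regret}(K) \;=\; \sum_{k=1}^{K} \left( V_1^{*}\!\left(s_1^{k}\right) - V_1^{\pi_k}\!\left(s_1^{k}\right) \right),
\qquad T = KH
```

Under this reading, a bound that grows as $\sqrt{T}$ means the average per-step regret vanishes as the number of interactions grows.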

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Machine Learning

2005.10804

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Optimizing Interactive Systems via Data-Driven Objectives

Li, Ziming, Kiseleva, Julia, Agarwal, Alekh, de Rijke, Maarten, White, Ryen W.

arXiv.org Artificial Intelligence, Jun-19-2020

Effective optimization is essential for real-world interactive systems to provide a satisfactory user experience in response to changing user behavior. However, it is often challenging to find an objective to optimize for interactive systems (e.g., policy learning in task-oriented dialog systems). Generally, such objectives are manually crafted and rarely capture complex user needs in an accurate manner. We propose an approach that infers the objective directly from observed user interactions. These inferences can be made regardless of prior knowledge and across different types of user behavior. We introduce Interactive System Optimizer (ISO), a novel algorithm that uses these inferred objectives for optimization. Our main contribution is a new general principled approach to optimizing interactive systems using data-driven objectives. We demonstrate the high effectiveness of ISO over several simulations.

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2006.12999

Country:

North America > United States (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)
(4 more...)

On Reward-Free Reinforcement Learning with Linear Function Approximation

Wang, Ruosong, Du, Simon S., Yang, Lin F., Salakhutdinov, Ruslan

arXiv.org Artificial Intelligence, Jun-19-2020

Reward-free reinforcement learning (RL) is a framework which is suitable for both the batch RL setting and the setting where there are many reward functions of interest. During the exploration phase, an agent collects samples without using a pre-specified reward function. After the exploration phase, a reward function is given, and the agent uses samples collected during the exploration phase to compute a near-optimal policy. Jin et al. [2020] showed that in the tabular setting, the agent only needs to collect a polynomial number of samples (in terms of the number of states, the number of actions, and the planning horizon) for reward-free RL. However, in practice, the number of states and actions can be large, and thus function approximation schemes are required for generalization. In this work, we give both positive and negative results for reward-free RL with linear function approximation. We give an algorithm for reward-free RL in the linear Markov decision process setting where both the transition and the reward admit linear representations. The sample complexity of our algorithm is polynomial in the feature dimension and the planning horizon, and is completely independent of the number of states and actions. We further give an exponential lower bound for reward-free RL in the setting where only the optimal $Q$-function admits a linear representation. Our results imply several interesting exponential separations on the sample complexity of reward-free RL.
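The phrase "both the transition and the reward admit linear representations" is usually formalized as follows. This is the standard linear MDP assumption from the literature, written here as a sketch; the symbols $\phi$, $\mu_h$, and $\theta_h$ are not given in the abstract and are assumptions.

```latex
P_h(s' \mid s, a) = \left\langle \phi(s, a),\, \mu_h(s') \right\rangle,
\qquad
r_h(s, a) = \left\langle \phi(s, a),\, \theta_h \right\rangle,
\qquad
\phi(s, a) \in \mathbb{R}^{d}
```

Here $\phi$ is a known feature map of dimension $d$, which is the "feature dimension" the sample complexity depends on in place of the number of states and actions.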

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2006.11274

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.81)

Media, Arts and Design *.Conferences – Next edition: Media, Arts and Design

#artificialintelligence, Jun-18-2020, 09:51:41 GMT

Please fill out the form below to submit an abstract (max.

evolutionary algorithm, machine learning, reinforcement learning, (14 more...)

#artificialintelligence

Country:

Europe > Middle East > Malta (0.07)
North America > United States > Texas (0.05)
North America > Saint Martin (0.04)
(7 more...)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
Media (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Science Fiction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(8 more...)

Reinforcement Learning Tic Tac Toe Python Implementation

#artificialintelligence, Jun-18-2020, 06:29:28 GMT

Reinforcement learning is a Machine Learning paradigm oriented toward agents learning to make the best decisions in order to maximize a reward. It is a very popular type of Machine Learning algorithm because some view it as a way to build algorithms that act as close as possible to human beings: choosing the action at every step so that you get the highest reward possible. While the other article explored the technical aspects of Reinforcement Learning, this time we will focus on the more practical aspects of the task. So let's jump right into the code. We will need to install only 2 dependencies for this one.
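The article's own code and its two dependencies are not shown in this excerpt. As a rough idea of what such an implementation can look like, here is a minimal sketch using only the standard library: a tabular Q-learning agent playing 'X' against a uniformly random 'O'. The algorithm choice, the function names, and the hyperparameters are assumptions for illustration, not the article's implementation.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    """Indices of the empty cells."""
    return [i for i, cell in enumerate(board) if cell == ' ']


def play_episode(q, rng, alpha=0.5, gamma=0.9, epsilon=0.1):
    """Play one game and update q in place.

    Returns the final reward for X: +1 win, -1 loss, 0 draw.
    """
    board = [' '] * 9
    while True:
        state = ''.join(board)
        moves = legal_moves(board)
        if rng.random() < epsilon:
            action = rng.choice(moves)  # explore
        else:
            action = max(moves, key=lambda m: q.get((state, m), 0.0))  # exploit
        board[action] = 'X'

        reward, done = 0.0, False
        if winner(board) == 'X':
            reward, done = 1.0, True
        elif not legal_moves(board):
            done = True  # board full after X's move: draw
        else:
            # Random opponent replies; X moves first, so a cell always remains.
            board[rng.choice(legal_moves(board))] = 'O'
            if winner(board) == 'O':
                reward, done = -1.0, True

        # Q-learning target: reward plus discounted best next-state value.
        target = reward
        if not done:
            nxt = ''.join(board)
            target += gamma * max(q.get((nxt, m), 0.0) for m in legal_moves(board))
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (target - old)
        if done:
            return reward


def train(episodes=20000, seed=0):
    """Train from scratch; returns the Q-table and the per-episode rewards."""
    rng = random.Random(seed)
    q = {}
    results = [play_episode(q, rng) for _ in range(episodes)]
    return q, results


if __name__ == "__main__":
    q, results = train()
    recent = results[-2000:]
    print("win rate over last 2000 games:", recent.count(1.0) / len(recent))
```

The Q-table is a plain dictionary keyed by (board string, move), which keeps the sketch dependency-free; a real implementation might instead use an array library or train by self-play.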

artificial intelligence, machine learning, reinforcement learning, (12 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Tic-Tac-Toe (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

LEAF: Latent Exploration Along the Frontier

Bharadhwaj, Homanga, Garg, Animesh, Shkurti, Florian

arXiv.org Artificial Intelligence, Jun-18-2020

Self-supervised goal proposal and reaching is a key component for exploration and efficient policy learning algorithms. Such a self-supervised approach without access to any oracle goal sampling distribution requires deep exploration and commitment so that long horizon plans can be efficiently discovered. In this paper, we propose an exploration framework, which learns a dynamics-aware manifold of reachable states. For a goal, our proposed method deterministically visits a state at the current frontier of reachable states (commitment/reaching) and then stochastically explores to reach the goal (exploration). This allocates exploration budget near the frontier of the reachable region instead of its interior. We target the challenging problem of policy learning from initial and goal states specified as images, and do not assume any access to the underlying ground-truth states of the robot and the environment. To keep track of reachable latent states, we propose a distance-conditioned reachability network that is trained to infer whether one state is reachable from another within the specified latent space distance. Given an initial state, we obtain a frontier of reachable states from that state. By incorporating a curriculum for sampling easier goals (closer to the start state) before more difficult goals, we demonstrate that the proposed self-supervised exploration algorithm can achieve $20\%$ superior performance on average compared to existing baselines on a set of challenging robotic environments, including on a real robot manipulation task.

artificial intelligence, exploration, neural network, (18 more...)

arXiv.org Artificial Intelligence

2005.10934

Country: North America > Canada > Ontario > Toronto (0.28)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (0.46)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Compositional Generalization by Learning Analytical Expressions

Liu, Qian, An, Shengnan, Lou, Jian-Guang, Chen, Bei, Lin, Zeqi, Gao, Yan, Zhou, Bin, Zheng, Nanning, Zhang, Dongmei

arXiv.org Artificial Intelligence, Jun-18-2020

Compositional generalization is a basic but essential intellective capability of human beings, which allows us to recombine known parts readily. However, existing neural network based models have been proven to be extremely deficient in such a capability. Inspired by work in cognition which argues compositionality can be captured by variable slots with symbolic functions, we present a refreshing view that connects a memory-augmented neural model with analytical expressions, to achieve compositional generalization. Our model consists of two cooperative neural modules Composer and Solver, fitting well with the cognitive argument while still being trained in an end-to-end manner via a hierarchical reinforcement learning algorithm. Experiments on a well-known benchmark SCAN demonstrate that our model seizes a great ability of compositional generalization, solving all challenges addressed by previous works with 100% accuracies.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2006.10627

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
