AITopics

2005.02057

Country:

Oceania > Australia (0.29)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Genre: Research Report (0.70)

Industry:

Government (0.68)
Education (0.68)
Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning, May-5-2020

Demand-Side Scheduling Based on Deep Actor-Critic Learning for Smart Grids

Lee, Joash, Wang, Wenbo, Niyato, Dusit

We consider the problem of demand-side energy management, where each household is equipped with a smart meter that is able to schedule home appliances online. The goal is to minimise the overall cost under a real-time pricing scheme. While previous works have introduced centralised approaches, we formulate the smart grid environment as a Markov game, where each household is a decentralised agent, and the grid operator produces a price signal that adapts to the energy demand. The main challenge addressed in our approach is partial observability and perceived non-stationarity of the environment from the viewpoint of each agent. We propose a multi-agent extension of a deep actor-critic algorithm that shows success in learning in this environment. This algorithm learns a centralised critic that coordinates training of all agents. Our approach thus uses centralised learning but decentralised execution. Simulation results show that our online deep reinforcement learning method can reduce both the peak-to-average ratio of total energy consumed and the cost of electricity for all households based purely on instantaneous observations and a price signal.
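As a rough illustration of the peak-to-average ratio named in the abstract, the sketch below computes it from total consumption per time slot. The function name `peak_to_average_ratio` and the sample loads are hypothetical and not taken from the paper.

```python
def peak_to_average_ratio(loads):
    """Return max(loads) / mean(loads) for a non-empty sequence of loads."""
    if not loads:
        raise ValueError("loads must be non-empty")
    average = sum(loads) / len(loads)
    if average == 0:
        raise ValueError("average load is zero, so the ratio is undefined")
    return max(loads) / average


# Hypothetical total consumption per time slot (same total energy in both cases).
unscheduled = [2.0, 2.0, 8.0, 4.0]  # demand concentrated in one slot
scheduled = [4.0, 4.0, 4.0, 4.0]    # demand spread evenly

print(peak_to_average_ratio(unscheduled))
print(peak_to_average_ratio(scheduled))
```

A flatter load profile gives a ratio closer to 1, which is the direction the scheduling method aims for.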

household, machine learning, reinforcement learning

2005.01979

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Power Industry (1.00)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, May-5-2020

Generalized Planning With Deep Reinforcement Learning

Rivlin, Or, Hazan, Tamir, Karpas, Erez

Classical Planning is concerned with finding plans, or sequences of actions, that, when applied to an initial condition specified by a set of logical predicates, bring the environment to a state that satisfies a set of goal predicates. This is usually done by a heuristic search procedure, and the resulting plan applies only to the specific instance that was solved. A stronger outcome would be a higher-level plan that solves many instances belonging to the same domain, which therefore share an underlying structure. The study of methods that can discover such higher-level plans is called Generalized Planning. Generalized plans do not necessarily exist for all classical planning domains, but finding them for domains where they do exist could obviate the need for compute-intensive search in cases where we only wish to find a goal-satisfying solution. As an example of such a generalized plan, consider a simplified Blocksworld domain. In this domain there are unique blocks that can be either stacked on each other or strewn about the floor, and the goal is to stack and unstack blocks so as to reach a goal configuration from an initial configuration. Finding a plan that does so in an optimal number of steps is generally NP-hard [10], but finding a plan that satisfies the goal regardless of cost can be done in polynomial time as follows:

1. Unstack all the blocks so that they are scattered on the floor.
2. Stack the blocks according to the goal configuration, beginning with the lower blocks.

This strategy is not optimal, since we might unstack blocks that are already in their proper place according to the goal specification, but it yields a goal-satisfying plan for every instance in this simplified Blocksworld domain.
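The two-step strategy described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's learned policy: the state encoding (a dict mapping each block to its support, either another block or `"table"`), the function name `generalized_blocksworld_plan`, and the action labels are all assumptions made for the example.

```python
def generalized_blocksworld_plan(initial, goal):
    """Return (plan, final_state) using the unstack-all-then-restack strategy.

    initial, goal: dicts mapping each block to its support ("table" or a block).
    Both are assumed to describe valid towers over the same set of blocks.
    """
    state = dict(initial)
    plan = []

    def is_clear(block):
        # A block is clear when nothing rests on it.
        return all(support != block for support in state.values())

    # Step 1: put every block on the table, always taking clear blocks first.
    while any(support != "table" for support in state.values()):
        progressed = False
        for block in sorted(state):
            if state[block] != "table" and is_clear(block):
                plan.append(("unstack", block, state[block]))
                state[block] = "table"
                progressed = True
        if not progressed:
            raise ValueError("initial state is not a valid set of towers")

    # Step 2: rebuild the goal towers from the bottom up.
    placed = {block for block, support in goal.items() if support == "table"}
    remaining = set(goal) - placed
    while remaining:
        progressed = False
        for block in sorted(remaining):
            if goal[block] in placed:
                plan.append(("stack", block, goal[block]))
                state[block] = goal[block]
                placed.add(block)
                remaining.discard(block)
                progressed = True
        if not progressed:
            raise ValueError("goal is not a valid set of towers")

    return plan, state


# Hypothetical instance: reverse a three-block tower.
initial = {"A": "table", "B": "A", "C": "B"}
goal = {"C": "table", "B": "C", "A": "B"}
plan, final_state = generalized_blocksworld_plan(initial, goal)
for step in plan:
    print(step)
```

The same function works for any number of blocks, which is the sense in which the plan is generalized: it trades optimality for applicability to every instance in the domain.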

artificial intelligence, machine learning, reinforcement learning

2005.02305

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
Africa > Togo (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Guerra, Anna, Guidi, Francesco, Dardari, Davide, Djuric, Petar M.

Reinforcement Learning for UAV Autonomous Navigation, Mapping and Target Detection

arXiv.org Machine Learning, May-5-2020

In this paper, we study a joint detection, mapping and navigation problem for a single unmanned aerial vehicle (UAV) equipped with a low-complexity radar and flying in an unknown environment. The goal is to optimize its trajectory so as to maximize the mapping accuracy while avoiding areas where measurements might not be sufficiently informative from the perspective of target detection. This problem is formulated as a Markov decision process (MDP) where the UAV is an agent that runs both a state estimator for target detection and environment mapping and a reinforcement learning (RL) algorithm to infer its own navigation policy (i.e., the control law). Numerical results show the feasibility of the proposed idea, highlighting the UAV's capability of autonomously exploring areas with high probability of target detection while reconstructing the surrounding environment.

eirp 5, machine learning, reinforcement learning

doi: 10.1109/PLANS46316.2020.9110163

2005.05057

Country:

Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.05)
North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre: Research Report (0.70)

Industry:

Information Technology > Robotics & Automation (0.34)
Aerospace & Defense > Aircraft (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

arXiv.org Artificial Intelligence, May-4-2020

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

Levine, Sergey, Kumar, Aviral, Tucker, George, Fu, Justin

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection. Offline reinforcement learning algorithms hold tremendous promise for making it possible to turn large datasets into powerful decision making engines. Effective offline reinforcement learning methods would be able to extract policies with the maximum possible utility out of the available data, thereby allowing automation of a wide range of decision-making domains, from healthcare and education to robotics. However, the limitations of current algorithms make this difficult. We will aim to provide the reader with an understanding of these challenges, particularly in the context of modern deep reinforcement learning methods, and describe some potential solutions that have been explored in recent work to mitigate these challenges, along with recent applications, and a discussion of perspectives on open problems in the field.
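To make the offline setting concrete, the sketch below runs tabular Q-learning purely over a fixed list of logged transitions, with no calls to an environment. It is a naive baseline chosen for illustration, not a method proposed in the tutorial; the function names, the tiny two-state dataset, and the hyperparameters are all hypothetical.

```python
from collections import defaultdict


def offline_q_learning(dataset, actions, gamma=0.9, lr=0.5, sweeps=200):
    """Estimate Q-values by repeatedly sweeping a fixed dataset of transitions.

    dataset: list of (state, action, reward, next_state, done) tuples
    collected beforehand; no new data is gathered during learning.
    """
    q = defaultdict(float)
    for _ in range(sweeps):
        for state, action, reward, next_state, done in dataset:
            if done:
                target = reward
            else:
                target = reward + gamma * max(q[(next_state, a)] for a in actions)
            q[(state, action)] += lr * (target - q[(state, action)])
    return q


def greedy_policy(q, state, actions):
    """Pick the action with the highest estimated value in the given state."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical logged data from a two-state chain; only "right" from s1 pays off.
actions = ["left", "right"]
dataset = [
    ("s0", "left", 0.0, "s0", False),
    ("s0", "right", 0.0, "s1", False),
    ("s1", "left", 0.0, "s0", False),
    ("s1", "right", 1.0, "end", True),
]
q = offline_q_learning(dataset, actions)
print(greedy_policy(q, "s0", actions), greedy_policy(q, "s1", actions))
```

This toy dataset covers every state-action pair. With real logged data that is rarely the case, and the value estimates for actions missing from the data are then unsupported by any observation.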

machine learning, reinforcement, reinforcement learning

2005.01643

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation (1.00)
Health & Medicine > Therapeutic Area (1.00)
Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

#artificialintelligence, May-3-2020, 11:22:45 GMT

UC Berkeley researchers open-source RAD to improve any reinforcement learning algorithm

In an accompanying paper, the authors say this module can improve any existing reinforcement learning algorithm and that RAD achieves better compute and data efficiency than Google AI's PlaNet, as well as recently released cutting-edge algorithms like DeepMind's Dreamer and SLAC from UC Berkeley and DeepMind. RAD achieves state-of-the-art results on common benchmarks and matches or beats every baseline in terms of performance and data efficiency across 15 DeepMind control environments, the researchers say. It does this in part by applying data augmentations for visual observations. Coauthors of the paper on RAD include Michael "Misha" Laskin, Kimin Lee, and Berkeley AI Research codirector and Covariant founder Pieter Abbeel. RAD was released Thursday on preprint repository arXiv.

artificial intelligence, machine learning, reinforcement learning

#artificialintelligence

Country: North America > United States > California > Alameda County > Berkeley (0.06)

Genre:

Research Report (0.58)
Summary/Review (0.38)

Industry: Education > Educational Setting > Higher Education (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Taywade, Kshitija, Goldsmith, Judy, Harrison, Brent

Reinforcement Learning for Decentralized Stable Matching

arXiv.org Artificial Intelligence, May-3-2020

Finding a match or partner in the real world is usually an independent and autonomous task performed by people or entities. For a person, a match can be several things, such as a romantic partner, business partner, school, or roommate. Our purpose in this paper is to train autonomous agents to find suitable matches for themselves using reinforcement learning. We consider the decentralized two-sided stable matching problem, where an agent is allowed to have at most one partner at a time from the opposite set. Each agent receives some utility for being in a match with a member of the opposite set. We formulate the problem spatially as a grid world environment; having autonomous agents act independently makes this environment very uncertain and dynamic. We run experiments with various instances of both complete and incomplete weighted preference lists for agents. Agents learn their policies separately, using separate training modules. Our goal is to train agents to find partners such that the outcome is a stable matching, if one exists, and also a matching with set-equality, meaning the outcome is approximately equally liked by agents from both sets.
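The stability criterion can be checked directly: a matching is stable when no two agents from opposite sets would both gain by leaving their current situation for each other. The sketch below lists such blocking pairs for weighted preference lists. The function name, the data layout, and the convention that an unmatched agent has utility 0 are assumptions for this example, not details from the paper.

```python
def blocking_pairs(matching, utility_a, utility_b):
    """Return the pairs (a, b) that would both prefer each other to their current match.

    matching: dict from side-A agent to its side-B partner (unmatched agents omitted).
    utility_a[a][b]: utility that a receives from b; utility_b[b][a] likewise.
    A missing entry means the pair is unacceptable (incomplete preference list).
    Matched pairs are assumed to be mutually acceptable.
    """
    partner_of_b = {b: a for a, b in matching.items()}

    def current_utility_a(a):
        b = matching.get(a)
        return utility_a[a][b] if b is not None else 0.0  # assumption: unmatched = 0

    def current_utility_b(b):
        a = partner_of_b.get(b)
        return utility_b[b][a] if a is not None else 0.0  # assumption: unmatched = 0

    pairs = []
    for a, prefs in utility_a.items():
        for b, gain_a in prefs.items():
            if matching.get(a) == b:
                continue
            gain_b = utility_b.get(b, {}).get(a)
            if gain_b is None:
                continue  # b does not list a, so the pair cannot form
            if gain_a > current_utility_a(a) and gain_b > current_utility_b(b):
                pairs.append((a, b))
    return pairs


# Hypothetical utilities: everyone ranks a1 and b1 highest.
utility_a = {"a1": {"b1": 3, "b2": 1}, "a2": {"b1": 2, "b2": 1}}
utility_b = {"b1": {"a1": 3, "a2": 1}, "b2": {"a1": 2, "a2": 1}}

stable = {"a1": "b1", "a2": "b2"}
unstable = {"a1": "b2", "a2": "b1"}
print(blocking_pairs(stable, utility_a, utility_b))
print(blocking_pairs(unstable, utility_a, utility_b))
```

An empty result means the matching is stable under these assumptions.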

artificial intelligence, machine learning, reinforcement learning

2005.01117

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Kentucky (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre:

Research Report (1.00)
Instructional Material (0.68)

Industry:

Leisure & Entertainment > Sports (0.46)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, May-3-2020

Off-Policy Adversarial Inverse Reinforcement Learning

Arnob, Samin Yeasar

Adversarial Imitation Learning (AIL) is a class of algorithms in reinforcement learning (RL) that tries to imitate an expert without taking any reward from the environment and without providing expert behavior directly to the policy training. Rather, an agent learns a policy distribution that minimizes the difference from expert behavior in an adversarial setting. Adversarial Inverse Reinforcement Learning (AIRL) leverages the idea of AIL, integrates a reward function approximation along with learning the policy, and shows the utility of IRL in the transfer learning setting. However, the reward function approximator that enables transfer learning does not perform well in imitation tasks. We propose an Off-Policy Adversarial Inverse Reinforcement Learning (Off-policy-AIRL) algorithm that is sample efficient and gives good imitation performance compared to the state-of-the-art AIL algorithm in continuous control tasks. For the same reward function approximator, we show the utility of our algorithm over AIL by using the learned reward function to retrain the policy on a task under significant variation where expert demonstrations are absent.

generator, machine learning, reinforcement learning

2005.01138

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, May-3-2020

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Wang, Lin, Yoon, Kuk-Jin

Deep neural models have in recent years been successful in almost every field, including extremely complex problem statements. However, these models are huge, with millions (and even billions) of parameters, so they demand heavy computation power and cannot be deployed on edge devices. Moreover, the performance boost is highly dependent on redundant labeled data. To achieve faster speeds and to handle the problems caused by the lack of data, knowledge distillation (KD) has been proposed to transfer information learned from one model to another. KD is often characterized by the so-called 'Student-Teacher' (S-T) learning framework and has been broadly applied in model compression and knowledge transfer. This paper is about KD and S-T learning, which have been actively studied in recent years. First, we aim to explain what KD is and how and why it works. Then, we provide a comprehensive survey of the recent progress of KD methods together with S-T frameworks, typically for vision tasks. In general, we consider some fundamental questions that have been driving this research area and thoroughly summarize the research progress and technical details. Additionally, we systematically analyze the research status of KD in vision applications. Finally, we discuss the potential and open challenges of existing methods and outline future directions of KD and S-T learning.

knowledge management, machine learning, reinforcement learning

2004.05937

Country:

North America > United States > California (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)
Asia > Middle East > Yemen > Amran Governorate > Amran (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry: Education > Educational Setting > Online (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Knowledge Management (1.00)
Information Technology > Information Management (1.00)

Haydari, Ammar, Yilmaz, Yasin

Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey

arXiv.org Machine Learning, May-2-2020

Recent technological improvements have increased the quality of transportation. New data-driven approaches open a new research direction for all control-based systems, e.g., in transportation, robotics, IoT and power systems. Combining data-driven applications with transportation systems plays a key role in recent transportation applications. In this paper, the latest deep reinforcement learning (RL) based traffic control applications are surveyed. Specifically, traffic signal control (TSC) applications based on (deep) RL, which have been studied extensively in the literature, are discussed in detail. Different problem formulations, RL parameters, and simulation environments for TSC are discussed comprehensively. The literature also includes several autonomous driving applications studied with deep RL models. Our survey extensively summarizes existing works in this field by categorizing them with respect to application types, control models and studied algorithms. Finally, we discuss the challenges and open questions regarding deep RL-based transportation applications.

artificial intelligence, machine learning, reinforcement learning

2005.00935

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)