AITopics

1905.02685

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

Design of Artificial Intelligence Agents for Games using Deep Reinforcement Learning

Roibu, Andrei Claudiu

In order to perform a large variety of tasks and to achieve human-level performance in complex real-world environments, Artificial Intelligence (AI) agents must be able to learn from their past experiences and gain both knowledge and an accurate representation of their environment from raw sensory inputs. Traditionally, AI agents have had difficulty using only sensory inputs to obtain a good representation of their environment and then mapping this representation to an efficient control policy. Deep reinforcement learning algorithms have provided a solution to this issue. In this study, the performance of different conventional and novel deep reinforcement learning algorithms was analysed. The proposed method utilises two types of algorithms, one trained with a variant of Q-learning (DQN) and another trained with SARSA learning (DSN), to assess the feasibility of using direct feedback alignment, a novel biologically plausible method for back-propagating the error. These novel agents, alongside two similar agents trained with the conventional backpropagation algorithm, were tested using the OpenAI Gym toolkit on several classic control theory problems and Atari 2600 video games. The results of this investigation open the way to new, biologically inspired deep reinforcement learning algorithms and their implementation on neuromorphic hardware.
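The difference between the Q-learning and SARSA training signals mentioned in this abstract can be illustrated with a minimal tabular sketch. This is not the paper's deep-network implementation (DQN/DSN with direct feedback alignment); the function names, the toy Q-table, and the hyperparameters below are all assumptions made for illustration.

```python
# Illustrative tabular TD targets. All names and numbers are hypothetical.

def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the greedy action in next_state."""
    return reward + gamma * max(q[next_state].values())


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q[next_state][next_action]


def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha towards the target."""
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Toy Q-table with two states and two actions (values are made up).
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 3.0},
}
gamma, alpha = 0.9, 0.5

# Same transition (reward 1.0, landing in s1), two different targets.
q_target = q_learning_target(q, reward=1.0, next_state="s1", gamma=gamma)
s_target = sarsa_target(q, reward=1.0, next_state="s1",
                        next_action="left", gamma=gamma)

# Apply the Q-learning target to Q(s0, right).
updated = td_update(q, "s0", "right", q_target, alpha)
```

Q-learning bootstraps from the best next action regardless of what the agent does, whereas SARSA bootstraps from the action the behaviour policy actually selects, so the two targets differ whenever the next action is not greedy.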

agent, computer based training, computer game

1905.04127

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.14)
North America > United States > Massachusetts (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.87)
Research Report > Experimental Study (0.67)

Industry:

Information Technology (0.93)
Energy > Oil & Gas (0.92)
Education > Educational Setting > Online (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Algorta, Simón, Şimşek, Özgür

The Game of Tetris in Machine Learning

The game of Tetris is an important benchmark for research in artificial intelligence and machine learning. This paper provides a historical account of the algorithmic developments in Tetris and discusses open challenges. Handcrafted controllers, genetic algorithms, and reinforcement learning have all contributed to good solutions. However, existing solutions fall far short of what can be achieved by expert players playing without time pressure. Further study of the game has the potential to contribute to important areas of research, including feature discovery, autonomous learning of action hierarchies, and sample-efficient reinforcement learning.

evolutionary algorithm, machine learning, reinforcement learning

1905.01652

Country:

Europe > Russia (0.14)
Asia > Russia (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Murad, Abdulmajid, Kraemer, Frank Alexander, Bach, Kerstin, Taylor, Gavin

Autonomous Management of Energy-Harvesting IoT Nodes Using Deep Reinforcement Learning

Reinforcement learning (RL) is capable of managing wireless, energy-harvesting IoT nodes by solving the problem of autonomous management in non-stationary, resource-constrained settings. We show that the state-of-the-art policy-gradient approaches to RL are appropriate for the IoT domain and that they outperform previous approaches. Due to the ability to model continuous observation and action spaces, as well as improved function approximation capability, the new approaches are able to solve harder problems, permitting reward functions that are better aligned with the actual application goals. We show such a reward function and use policy-gradient approaches to learn capable policies, leading to behavior more appropriate for IoT nodes with less manual design effort, increasing the level of autonomy in IoT.

artificial intelligence, machine learning, reinforcement learning

1905.04181

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > United States > New York (0.04)
North America > United States > Maryland > Anne Arundel County > Annapolis (0.04)

Genre: Research Report (0.64)

Industry:

Energy > Energy Storage (0.73)
Energy > Renewable > Solar (0.70)
Energy > Power Industry (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Barati, Elaheh, Chen, Xuewen, Zhong, Zichun

Attention-based Deep Reinforcement Learning for Multi-view Environments

arXiv.org Machine Learning, May-10-2019

In reinforcement learning algorithms, it is common practice to account for only a single view of the environment when making decisions; however, utilizing multiple views of the environment can help to promote the learning of complicated policies. Since the views may frequently suffer from partial observability, the observations they provide can have different levels of importance. In this paper, we present a novel attention-based deep reinforcement learning method for a multi-view environment in which each view can provide various representative information about the environment. Specifically, our method learns a policy to dynamically attend to views of the environment based on their importance in the decision-making process. We evaluate the performance of our method on the TORCS racing car simulator and three other complex 3D environments with obstacles.
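The idea of attending to views according to their importance can be sketched as a softmax-weighted combination of per-view features. The abstract does not specify the architecture, so this is only an assumed, simplified form: the `attend` function, the feature vectors, and the fixed importance scores are hypothetical, and in a learned system the scores would come from a trained network rather than constants.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(view_features, view_scores):
    """Fuse per-view feature vectors using softmax attention weights."""
    weights = softmax(view_scores)
    dim = len(view_features[0])
    fused = [
        sum(w * features[i] for w, features in zip(weights, view_features))
        for i in range(dim)
    ]
    return weights, fused


# Three hypothetical views, each summarised by a 2-dimensional feature vector.
views = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# Hypothetical importance scores: the first view is judged most informative.
scores = [2.0, 0.0, 0.0]

weights, fused = attend(views, scores)
```

The fused vector leans towards the features of the highest-scoring view while still retaining a contribution from the others, which is what lets a policy down-weight views that are partially occluded at a given step.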

adrl, attention-based deep reinforcement learning, international conference

1905.03985

Country:

North America > United States > Michigan > Wayne County > Detroit (0.05)
North America > Canada > Quebec > Montreal (0.05)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Padakandla, Sindhu, J, Prabuchandran K., Bhatnagar, Shalabh

Reinforcement Learning in Non-Stationary Environments

arXiv.org Machine Learning, May-10-2019

Reinforcement learning (RL) methods learn optimal decisions in the presence of a stationary environment. However, the stationarity assumption on the environment is very restrictive. In many real-world problems, such as traffic signal control and robotic applications, one often encounters non-stationary environments, and in these scenarios RL methods yield sub-optimal decisions. In this paper, we thus consider the problem of developing RL methods that obtain optimal decisions in a non-stationary environment. The goal of this problem is to maximize the long-term discounted reward achieved when the underlying model of the environment changes over time. To achieve this, we first adapt a change point algorithm to detect changes in the statistics of the environment and then develop an RL algorithm that maximizes the long-run reward accrued. We illustrate that our change point method detects changes in the model of the environment effectively and thus helps the RL algorithm maximize the long-run reward. We further validate the effectiveness of the proposed solution on non-stationary random Markov decision processes, a sensor energy management problem, and a traffic signal control problem.
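The detection step can be illustrated with a simple two-sided CUSUM test on a stream of rewards. CUSUM is used here only as a stand-in: the abstract does not name the change point algorithm the authors adapt, and the function name, baseline mean, drift, and threshold below are assumptions chosen for this toy example.

```python
def cusum_detect(samples, baseline_mean, drift, threshold):
    """Return the first index where a two-sided CUSUM statistic exceeds
    the threshold, or None if no change is flagged."""
    upper = 0.0  # accumulates evidence of an upward shift in the mean
    lower = 0.0  # accumulates evidence of a downward shift in the mean
    for i, x in enumerate(samples):
        upper = max(0.0, upper + (x - baseline_mean - drift))
        lower = max(0.0, lower + (baseline_mean - x - drift))
        if upper > threshold or lower > threshold:
            return i
    return None


# Hypothetical reward stream: the mean reward jumps from 1.0 to 3.0
# at index 20, mimicking a change in the environment model.
rewards = [1.0] * 20 + [3.0] * 20

change_index = cusum_detect(rewards, baseline_mean=1.0, drift=0.5, threshold=4.0)
```

Once a change is flagged, a learner could, for example, reset or re-initialise its value estimates before continuing; how the paper couples detection to learning is not described in the abstract.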

agent, algorithm, changepoint

1905.0397

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Energy (1.00)
Transportation > Infrastructure & Services (0.69)
Transportation > Ground > Road (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Kamanchi, Chandramouli, Diddigi, Raghuram Bharadwaj, Bhatnagar, Shalabh

Second Order Value Iteration in Reinforcement Learning

arXiv.org Machine Learning, May-10-2019

Value iteration is a fixed-point iteration technique used to obtain the optimal value function and policy in a discounted-reward Markov Decision Process (MDP). Here, a contraction operator is constructed and applied repeatedly to arrive at the optimal solution. Value iteration is a first-order method and may therefore take a large number of iterations to converge to the optimal solution. In this work, we propose a novel second-order value iteration procedure based on the Newton-Raphson method. We first construct a modified contraction operator and then apply the Newton-Raphson method to arrive at our algorithm. We prove the global convergence of our algorithm to the optimal solution and show the second-order convergence. Through experiments, we demonstrate the effectiveness of our proposed approach.
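For reference, the standard first-order value iteration that this abstract takes as its baseline can be sketched as repeated application of the Bellman optimality operator. The proposed Newton-Raphson variant is not reproduced here; the two-state MDP, its rewards, and the function name are hypothetical.

```python
def value_iteration(transitions, rewards, gamma, tol=1e-8, max_iters=10_000):
    """Standard value iteration.

    transitions[s][a] is a list of (probability, next_state) pairs and
    rewards[s][a] is the immediate reward for taking action a in state s.
    Returns the value estimates and the number of sweeps performed.
    """
    values = {s: 0.0 for s in transitions}
    for iteration in range(1, max_iters + 1):
        # One application of the Bellman optimality (contraction) operator.
        new_values = {
            s: max(
                rewards[s][a]
                + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])
                for a in transitions[s]
            )
            for s in transitions
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            return values, iteration
    return values, max_iters


# Hypothetical deterministic two-state MDP.
transitions = {
    "A": {"stay": [(1.0, "A")], "go": [(1.0, "B")]},
    "B": {"stay": [(1.0, "B")], "go": [(1.0, "A")]},
}
rewards = {
    "A": {"stay": 0.0, "go": 1.0},
    "B": {"stay": 2.0, "go": 0.0},
}

values, iterations = value_iteration(transitions, rewards, gamma=0.9)
```

Because the operator contracts by a factor of `gamma` per sweep, the error shrinks only linearly, so many sweeps are needed when `gamma` is close to 1. That is the slow convergence a second-order method aims to address.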

algorithm, iteration, value function

1905.03927

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Rudolph, Stefan, Tomforde, Sven, Hähner, Jörg

On the Detection of Mutual Influences and Their Consideration in Reinforcement Learning Processes

Self-adaptation has been proposed as a mechanism to counter complexity in control problems of technical systems. A major driver behind self-adaptation is the idea of transferring traditional design-time decisions to runtime and into the responsibility of the systems themselves. In order to deal with unforeseen events and conditions, systems need creativity, typically realized by means of machine learning capabilities. Such learning mechanisms are based on different sources of knowledge. Feedback from the environment used for reinforcement purposes is probably the most prominent one within the self-adapting and self-organizing (SASO) systems community. However, the impact of other (sub-)systems on the success of the individual system's learning performance has mostly been neglected in this context. In this article, we propose a novel methodology to identify the effects of actions performed by other systems in a shared environment on the utility achievement of an autonomous system. Consider smart cameras (SCs) as an illustrating example: for goals such as 3D reconstruction of objects, the most promising configuration of one SC in terms of pan/tilt/zoom parameters depends largely on the configuration of other SCs in the vicinity. Since such mutual influences cannot be pre-defined for dynamic systems, they have to be learned at runtime. Furthermore, they have to be taken into consideration when a system self-improves its own configuration decisions based on a feedback loop concept, e.g., as known from the SASO domain or the Autonomic and Organic Computing initiatives. We define a methodology to detect such influences at runtime, present an approach to consider this information in a reinforcement learning technique, and analyze the behavior in artificial as well as real-world SASO system settings.

artificial intelligence, machine learning, reinforcement learning

1905.04205

Country:

North America > United States (0.46)
North America > Canada (0.45)
Europe > Germany (0.28)
Europe > Portugal (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Optimizing Routerless Network-on-Chip Designs: An Innovative Learning-Based Framework

Lin, Ting-Ru, Penney, Drew, Pedram, Massoud, Chen, Lizhong

Machine learning applied to architecture design presents a promising opportunity with broad applications. Recent deep reinforcement learning (DRL) techniques, in particular, enable efficient exploration in vast design spaces where conventional design strategies may be inadequate. This paper proposes a novel deep reinforcement framework, taking routerless networks-on-chip (NoC) as an evaluation case study. The new framework successfully resolves problems with prior design approaches being either unreliable due to random searches or inflexible due to severe design space restrictions. The framework learns (near-)optimal loop placement for routerless NoCs with various design constraints. A deep neural network is developed using parallel threads that efficiently explore the immense routerless NoC design space with a Monte Carlo search tree. Experimental results show that, compared with conventional mesh, the proposed deep reinforcement learning (DRL) routerless design achieves a 3.25x increase in throughput, 1.6x reduction in packet latency, and 5x reduction in power. Compared with the state-of-the-art routerless NoC, DRL achieves a 1.47x increase in throughput, 1.18x reduction in packet latency, and 1.14x reduction in average hop count albeit with slightly more power overhead.

artificial intelligence, machine learning, reinforcement learning

1905.04423

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.34)

Industry: Semiconductors & Electronics (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Hahn, Carsten, Phan, Thomy, Gabor, Thomas, Belzner, Lenz, Linnhoff-Popien, Claudia

Emergent Escape-based Flocking Behavior using Multi-Agent Reinforcement Learning

In nature, flocking or swarm behavior is observed in many species, as it has beneficial properties such as reducing the probability of being caught by a predator. In this paper, we propose SELFish (Swarm Emergent Learning Fish), an approach with multiple autonomous agents that can move freely in a continuous space with the objective of avoiding being caught by a predator. The predator has the property that it might get distracted by multiple possible prey in its vicinity. We show that this property, in interaction with self-interested agents that are trained with reinforcement learning solely to survive as long as possible, leads to flocking behavior similar to Boids, a common simulation of flocking behavior. Furthermore, we present interesting insights into the swarming behavior and into the process of agents being caught in our modeled environment.

artificial intelligence, machine learning, reinforcement learning

1905.04077

Country: Europe > Germany (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)