AITopics

1903.01599

Country: North America > Canada (0.14)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (0.47)
Education (0.46)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Canaan, Rodrigo, Salge, Christoph, Togelius, Julian, Nealen, Andy

Leveling the Playing Field - Fairness in AI Versus Human Game Benchmarks

arXiv.org Artificial Intelligence Mar-16-2019

From the beginning of the history of AI, there has been interest in games as a platform for research. As the field developed, human-level competence in complex games became a target researchers worked to reach. Only relatively recently has this target finally been met for traditional tabletop games such as Backgammon, Chess and Go. Current research focus has shifted to electronic games, which provide unique challenges. As is often the case with AI research, these results are liable to be exaggerated or misrepresented by either authors or third parties. The extent to which these game benchmarks consist of fair competition between human and AI is also a matter of debate. In this work, we review the statements made by authors and third parties in the general media and academic circles about these game benchmark results and discuss factors that can impact the perception of fairness in the contest between humans and machines.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

1903.07008

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Leisure & Entertainment > Games > Chess (1.00)

Technology:

Information Technology > Artificial Intelligence > Games (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

arXiv.org Artificial Intelligence Mar-16-2019

Discovering Options for Exploration by Minimizing Cover Time

Jinnai, Yuu, Park, Jee Won, Abel, David, Konidaris, George

One of the main challenges in reinforcement learning is solving tasks with sparse reward. We show that the difficulty of discovering a distant rewarding state in an MDP is bounded by the expected cover time of a random walk over the graph induced by the MDP's transition dynamics. We therefore propose to accelerate exploration by constructing options that minimize cover time. The ...

Finding a set of edges that minimizes expected cover time is an extremely hard combinatorial optimization problem (Braess, 1968; Braess et al., 2005). Thus, our algorithm instead seeks to minimize the upper bound of the expected cover time given as a function of the algebraic connectivity of the graph Laplacian (Fiedler, 1973; Broder & Karlin, 1989; Chung, 1996) using the heuristic method by Ghosh & Boyd (2006) that improves the upper bound of the expected cover time of a uniform random walk.
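To make the notion of cover time concrete, the sketch below estimates by simulation how many steps a uniform random walk needs to visit every node of a small graph. It is not the authors' option-discovery algorithm: the two 6-node graphs (a path and the same path closed into a cycle by one extra edge) and the helper names `cover_time` and `mean_cover_time` are assumptions chosen for illustration. In this particular example the extra edge lowers the expected cover time.

```python
import random

def cover_time(neighbors, start, rng):
    """Count the steps a uniform random walk takes to visit every node."""
    visited = {start}
    node, steps = start, 0
    while len(visited) < len(neighbors):
        node = rng.choice(neighbors[node])  # move to a uniformly random neighbor
        visited.add(node)
        steps += 1
    return steps

def mean_cover_time(neighbors, start=0, trials=20000, seed=0):
    """Monte Carlo estimate of the expected cover time from `start`."""
    rng = random.Random(seed)
    total = sum(cover_time(neighbors, start, rng) for _ in range(trials))
    return total / trials

n = 6  # illustrative graph size (assumption)
# Path 0-1-2-3-4-5: end nodes have one neighbor, inner nodes have two.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
# Same path plus the edge 5-0, which closes it into a cycle.
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

path_mean = mean_cover_time(path)    # theory: (n - 1)^2 = 25 from an end node
cycle_mean = mean_cover_time(cycle)  # theory: n (n - 1) / 2 = 15
print(f"path:  {path_mean:.2f} steps on average")
print(f"cycle: {cycle_mean:.2f} steps on average")
```

The walk starts at node 0 in both graphs, so on the path it must travel from one end to the other before everything is covered.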

cover time, machine learning, reinforcement learning, (14 more...)

1903.00606

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)

Balevi, Eren, Andrews, Jeffrey G.

Online Antenna Tuning in Heterogeneous Cellular Networks with Deep Reinforcement Learning

We aim to jointly optimize the antenna tilt angle, and the vertical and horizontal half-power beamwidths of the macrocells in a heterogeneous cellular network (HetNet) via a synergistic combination of deep learning (DL) and reinforcement learning (RL). The interactions between the cells, most notably due to their coupled interference and the large number of users, render this optimization problem prohibitively complex. This makes the proposed deep RL technique attractive as a practical online solution for real deployments, which should automatically adapt to new base stations being added and other environmental changes in the network. In the proposed algorithm, DL is used to extract the features by learning the locations of the users, and mean field RL is used to learn the average interference values for different antenna settings. Our results illustrate that the proposed deep RL algorithm can approach the optimum weighted sum rate with hundreds of online trials, as opposed to millions of trials for standard Q-learning, assuming relatively low environmental dynamics. Furthermore, the proposed algorithm is compact and implementable, and empirically appears to provide a performance guarantee regardless of the amount of environmental dynamics.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1903.06787

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.84)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

arXiv.org Artificial Intelligence Mar-15-2019

AI2-THOR: An Interactive 3D Environment for Visual AI

Kolve, Eric, Mottaghi, Roozbeh, Han, Winson, VanderBilt, Eli, Weihs, Luca, Herrasti, Alvaro, Gordon, Daniel, Zhu, Yuke, Gupta, Abhinav, Farhadi, Ali

We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor scenes, where AI agents can navigate in the scenes and interact with objects to perform tasks. AI2-THOR enables research in many different domains including but not limited to deep reinforcement learning, imitation learning, learning by interaction, planning, visual question answering, unsupervised representation learning, object detection and segmentation, and learning models of cognition. The goal of AI2-THOR is to facilitate building visually intelligent models and push the research forward in this domain.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1712.05474

Country: Europe > Sweden > Skåne County > Malmö (0.05)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.56)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)

Atari-HEAD: Atari Human Eye-Tracking and Demonstration Dataset

Zhang, Ruohan, Liu, Zhuode, Guan, Lin, Zhang, Luxin, Hayhoe, Mary M, Ballard, Dana H

... and eye movements while playing Atari video games. The dataset currently has 44 hours of gameplay data from 16 games and a total of 2.97 million demonstrated actions. Human subjects played games in a frame-by-frame manner to allow enough decision time in order to obtain near-optimal decisions. This dataset could be potentially used for research in imitation learning, reinforcement learning, and ...

Additionally, previous research has shown that given a task context, human visual attention is modulated by reward [5, 9, 17]. In performing a familiar task, objects with high potential reward or penalty attract human attention; hence gaze indicates the momentary attentional priorities over multiple objects. Therefore the gaze could be a potentially useful intermediate learning signal for imitation learning.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

1903.06754

Country: North America > United States > Texas (0.17)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)

Mason, Karl, Grijalva, Santiago

A Review of Reinforcement Learning for Autonomous Building Energy Management

The area of building energy management has received a significant amount of interest in recent years. This area is concerned with combining advancements in sensor technologies, communications and advanced control algorithms to optimize energy utilization. Reinforcement learning is one of the most prominent machine learning algorithms used for control problems and has had many successful applications in the area of building energy management. This research gives a comprehensive review of the literature relating to the application of reinforcement learning to developing autonomous building energy management systems. The main direction for future research and challenges in reinforcement learning are also outlined.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

1903.05196

Country:

North America > United States (0.68)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.34)

Industry:

Transportation > Ground > Road (1.00)
Energy > Power Industry (1.00)
Energy > Renewable > Solar (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Kamanchi, Chandramouli, Diddigi, Raghuram Bharadwaj, Bhatnagar, Shalabh

Successive Over Relaxation Q-Learning

In a discounted reward Markov Decision Process (MDP) the objective is to find the optimal value function, i.e., the value function corresponding to an optimal policy. This problem reduces to solving a functional equation known as the Bellman equation, and a fixed point iteration scheme known as value iteration is utilized to obtain the solution. In [1], a successive over-relaxation based value iteration scheme is proposed to speed up the computation of the optimal value function. They propose a modified Bellman equation and prove faster convergence to the optimal value function. However, in many practical applications, the model information is not known and we resort to Reinforcement Learning (RL) algorithms to obtain the optimal policy and value function. One such popular algorithm is Q-Learning. In this paper, we propose Successive Over Relaxation (SOR) Q-Learning. We first derive a fixed point iteration for optimal Q-values based on [1] and utilize stochastic approximation to derive a learning algorithm to compute the optimal value function and an optimal policy. We then prove the convergence of the SOR Q-Learning to optimal Q-values. Finally, through numerical experiments, we show that SOR Q-Learning is faster than the standard Q-Learning algorithm.
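As a rough illustration of the value-iteration idea described above, the sketch below runs a synchronous Bellman backup with a relaxation factor `w` on a tiny MDP. With `w = 1` it is ordinary value iteration. The relaxed backup `w * (r + gamma * E[V]) + (1 - w) * V(s)` and the bound `w_star = 1 / (1 - gamma * min self-loop probability)` are one common way to write an over-relaxed scheme and are assumptions here, since the abstract does not spell out the scheme in [1]. The two-state MDP and all of its numbers are invented for the example. This is the model-based iteration only, not the SOR Q-Learning algorithm itself.

```python
def value_iteration(P, R, gamma, w=1.0, tol=1e-10, max_sweeps=100000):
    """Synchronous value iteration with relaxation factor w (w = 1 is standard).

    P[s][a] is a list of next-state probabilities, R[s][a] the expected reward.
    Returns the value estimates and the number of sweeps performed.
    """
    n = len(P)
    V = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        new_V = [
            max(
                # Relaxed backup (assumed form): blend the Bellman target
                # with the current value of the same state.
                w * (R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V)))
                + (1.0 - w) * V[s]
                for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            return V, sweep
    return V, max_sweeps

# Hypothetical 2-state, 2-action MDP; every (state, action) has a self-loop.
P = [
    [[0.8, 0.2], [0.5, 0.5]],  # transitions from state 0
    [[0.3, 0.7], [0.4, 0.6]],  # transitions from state 1
]
R = [[1.0, 0.5], [0.0, 2.0]]
gamma = 0.9

# Largest relaxation factor considered here (assumed bound, see lead-in).
min_self_loop = min(P[s][a][s] for s in range(len(P)) for a in range(len(P[s])))
w_star = 1.0 / (1.0 - gamma * min_self_loop)

V_std, sweeps_std = value_iteration(P, R, gamma, w=1.0)
V_sor, sweeps_sor = value_iteration(P, R, gamma, w=w_star)
print("standard:", V_std, "sweeps:", sweeps_std)
print("relaxed: ", V_sor, "sweeps:", sweeps_sor)
```

Because `w` multiplies the whole Bellman target and `(1 - w)` multiplies the current value of the same state, any fixed point of the relaxed backup is also a fixed point of the ordinary one, so both runs approach the same values. The relaxed run needs fewer sweeps on this example.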

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1903.03812

Country: Asia > India (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

arXiv.org Artificial Intelligence Mar-14-2019

Adaptive Variance for Changing Sparse-Reward Environments

Lin, Xingyu, Guo, Pengsheng, Florensa, Carlos, Held, David

Robots that are trained to perform a task in a fixed environment often fail when facing unexpected changes to the environment due to a lack of exploration. We propose a principled way to adapt the policy for better exploration in changing sparse-reward environments. Unlike previous works which explicitly model environmental changes, we analyze the relationship between the value function and the optimal exploration for a Gaussian-parameterized policy and show that our theory leads to an effective strategy for adjusting the variance of the policy, enabling fast adaptation to changes in a variety of sparse-reward environments.

machine learning, reinforcement learning, variance, (18 more...)

1903.06309

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Ozaki, Yasunori, Ishihara, Tatsuya, Matsumura, Narimune, Nunobiki, Tadashi

Can Robot Attract Passersby without Causing Discomfort by User-Centered Reinforcement Learning?

arXiv.org Artificial Intelligence Mar-14-2019

The aim of our study was to develop a method by which a social robot can greet passersby and get their attention without causing them to suffer discomfort. A number of customer services have recently come to be provided by social robots rather than people, including serving as receptionists, guides, and exhibitors. Robot exhibitors, for example, can explain products being promoted by the robot owners. However, a sudden greeting by a robot can startle passersby and cause them discomfort. Social robots should thus adapt their mannerisms to the situation they face regarding passersby. We developed a method for meeting this requirement on the basis of the results of related work. Our proposed method, user-centered reinforcement learning, enables robots to greet passersby and get their attention without causing them to suffer discomfort (p<0.01). The results of an experiment in the field, an office entrance, demonstrated that our method meets this requirement.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

1903.05881

Country: Asia > Japan (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)