Kartal, Bilal
On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman
Gao, Chao, Kartal, Bilal, Hernandez-Leal, Pablo, Taylor, Matthew E.
How to best explore in domains with sparse, delayed, and deceptive rewards is an important open problem for reinforcement learning (RL). This paper considers one such domain, the recently proposed multi-agent benchmark of Pommerman. This domain is very challenging for RL: past work has shown that model-free RL algorithms fail to achieve significant learning without artificially reducing the environment's complexity. In this paper, we illuminate the reasons behind this failure by providing a thorough analysis of the hardness of random exploration in Pommerman. While model-free random exploration is typically futile, we develop a model-based automatic reasoning module that can be used for safer exploration by pruning actions that will surely lead the agent to death. We empirically demonstrate that this module can significantly improve learning.
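To make the pruning idea concrete, here is a minimal sketch of a one-step look-ahead rule on a simplified grid view of a Pommerman-like board. The names (blast_cells, prune_unsafe_actions) and the one-step horizon are illustrative assumptions, not the paper's actual reasoning module:

    # Illustrative one-step-ahead action pruning for a Pommerman-like grid.
    from typing import List, Set, Tuple

    Pos = Tuple[int, int]

    MOVES = {0: (0, 0), 1: (-1, 0), 2: (1, 0), 3: (0, -1), 4: (0, 1)}  # stop/up/down/left/right

    def blast_cells(bombs: List[Tuple[Pos, int, int]], ticks: int) -> Set[Pos]:
        """Cells covered by flames of bombs exploding within `ticks` steps.
        Each bomb is (position, fuse_remaining, blast_strength)."""
        danger: Set[Pos] = set()
        for (r, c), fuse, strength in bombs:
            if fuse <= ticks:
                danger.add((r, c))
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    for k in range(1, strength):
                        danger.add((r + dr * k, c + dc * k))
        return danger

    def prune_unsafe_actions(agent_pos: Pos, bombs, passable: Set[Pos]) -> List[int]:
        """Keep only actions that do not put the agent on a cell flamed next tick."""
        danger = blast_cells(bombs, ticks=1)
        safe = []
        for action, (dr, dc) in MOVES.items():
            nxt = (agent_pos[0] + dr, agent_pos[1] + dc)
            if (nxt == agent_pos or nxt in passable) and nxt not in danger:
                safe.append(action)
        return safe or [0]  # if everything looks fatal, fall back to "stop"

During training, such a filter would simply mask out the pruned actions before the agent samples from its policy, so exploration only wanders among moves that are not immediately fatal.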
Action Guidance with MCTS for Deep Reinforcement Learning
Kartal, Bilal, Hernandez-Leal, Pablo, Taylor, Matthew E.
Deep reinforcement learning has achieved great successes in recent years; however, one main challenge is sample inefficiency. In this paper, we focus on how to use action guidance by means of a non-expert demonstrator to improve sample efficiency in a domain with sparse, delayed, and possibly deceptive rewards: the recently proposed multi-agent benchmark of Pommerman. We propose a new framework where even a non-expert simulated demonstrator, e.g., planning algorithms such as Monte Carlo tree search with a small number of rollouts, can be integrated within asynchronous distributed deep reinforcement learning methods. Compared to a vanilla deep RL algorithm, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.
Introduction
Deep reinforcement learning (DRL) has enabled better scalability and generalization for challenging domains (Arulkumaran et al. 2017; Li 2017; Hernandez-Leal, Kartal, and Taylor 2018) such as Atari games (Mnih et al. 2015), Go (Silver et al. 2016), and multiagent games (e.g., StarCraft II and Dota 2) (OpenAI 2018). However, one of the current biggest challenges for DRL is sample efficiency (Yu 2018). On the one hand, once a DRL agent is trained, it can be deployed to act in real time by only performing an inference through the trained model. On the other hand, planning methods such as Monte Carlo tree search (MCTS) (Browne et al. 2012) do not have a training phase, but they perform computationally costly simulation-based rollouts (assuming access to a simulator) to find the best action to take. There are several ways to get the best of both DRL and search methods.
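One simple way to picture this kind of integration is an auxiliary supervised term that pulls the policy toward the planner's chosen action. The sketch below assumes PyTorch, a network that outputs action logits and a state value, and demonstrator actions collected from shallow MCTS running alongside the learner; the coefficients and names are illustrative, not the authors' exact loss:

    # Actor-critic loss plus an auxiliary imitation term toward a demonstrator.
    import torch.nn.functional as F

    def a3c_with_demo_loss(logits, values, actions, returns, demo_actions, demo_weight=1.0):
        log_probs = F.log_softmax(logits, dim=-1)
        advantages = returns - values.detach()
        policy_loss = -(log_probs.gather(1, actions.unsqueeze(1)).squeeze(1) * advantages).mean()
        value_loss = F.mse_loss(values, returns)
        entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
        # Auxiliary supervised term: pull the policy toward the planner's action.
        demo_loss = F.cross_entropy(logits, demo_actions)
        return policy_loss + 0.5 * value_loss - 0.01 * entropy + demo_weight * demo_loss

Because the demonstrator only contributes an extra loss term, it can be cheap and non-expert: even shallow MCTS rollouts provide a useful training signal without being followed at execution time.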
Terminal Prediction as an Auxiliary Task for Deep Reinforcement Learning
Kartal, Bilal, Hernandez-Leal, Pablo, Taylor, Matthew E.
Deep reinforcement learning has achieved great successes in recent years, but there are still open challenges, such as convergence to locally optimal policies and sample inefficiency. In this paper, we contribute a novel self-supervised auxiliary task, Terminal Prediction (TP), which estimates temporal closeness to terminal states for episodic tasks. The intuition is to help representation learning by letting the agent predict how close it is to a terminal state while learning its control policy. Although TP could be integrated with multiple algorithms, this paper focuses on Asynchronous Advantage Actor-Critic (A3C) and demonstrates the advantages of A3C-TP. Our extensive evaluation includes a set of Atari games, the BipedalWalker domain, and a mini version of the recently proposed multi-agent Pommerman game. Our results on Atari games and the BipedalWalker domain suggest that A3C-TP outperforms standard A3C in most of the tested domains and performs comparably in the others. In Pommerman, our proposed method provides significant improvements both in learning efficiency and in converging to better policies against different opponents.
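A minimal sketch of how such a terminal-prediction term could be attached to an actor-critic loss follows, assuming PyTorch and a network with an extra scalar prediction head; the normalized target t / (T - 1) and the weight are illustrative choices rather than values taken verbatim from the paper:

    # Self-supervised terminal-prediction auxiliary term for an episodic task.
    import torch
    import torch.nn.functional as F

    def terminal_prediction_targets(episode_length: int) -> torch.Tensor:
        """Target for step t is t / (T - 1): 0 at the first state, 1 at the terminal state."""
        T = episode_length
        return torch.arange(T, dtype=torch.float32) / max(T - 1, 1)

    def a3c_tp_loss(a3c_loss, tp_predictions, tp_targets, tp_weight=0.5):
        """Standard A3C loss plus the self-supervised terminal-prediction term."""
        return a3c_loss + tp_weight * F.mse_loss(tp_predictions, tp_targets)

The targets are only known once the episode finishes, which fits naturally with on-policy methods like A3C that update from completed (or truncated) rollouts.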
Skynet: A Top Deep RL Agent in the Inaugural Pommerman Team Competition
Gao, Chao, Hernandez-Leal, Pablo, Kartal, Bilal, Taylor, Matthew E.
The Pommerman Team Environment is a recently proposed benchmark which involves a multi-agent domain with challenges such as partial observability, decentralized execution (without communication), and very sparse and delayed rewards. The inaugural Pommerman Team Competition held at NeurIPS 2018 hosted 25 participants who each submitted a team of 2 agents. Our submission nn_team_skynet955_skynet955 won 2nd place in the "learning agents" category. Our team is composed of 2 neural networks trained with state-of-the-art deep reinforcement learning algorithms and makes use of concepts like reward shaping, curriculum learning, and an automatic reasoning module for action pruning. Here, we describe these elements and additionally present a collection of open-sourced agents that can be used for training and testing in the Pommerman environment. Code available at: https://github.com/BorealisAI/pommerman-baseline
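As a rough illustration of the reward-shaping ingredient, the sketch below combines the sparse win/loss outcome with small dense per-step bonuses; the event names and coefficients are placeholders, not the shaping terms actually used by the skynet955 agents:

    # Illustrative dense reward shaping on top of the sparse game outcome.
    def shaped_reward(events: dict) -> float:
        reward = 1.0 if events.get("won") else (-1.0 if events.get("died") else 0.0)
        reward += 0.1 * events.get("woods_destroyed", 0)      # cleared wooden walls
        reward += 0.1 * events.get("powerups_collected", 0)   # picked-up power-ups
        reward += 0.05 if events.get("reached_new_cell") else 0.0
        return reward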
Safer Deep RL with Shallow MCTS: A Case Study in Pommerman
Kartal, Bilal, Hernandez-Leal, Pablo, Gao, Chao, Taylor, Matthew E.
Safe reinforcement learning has many variants and it is still an open research problem. Here, we focus on how to use action guidance by means of a non-expert demonstrator to avoid catastrophic events in a domain with sparse, delayed, and deceptive rewards: the recently proposed multi-agent benchmark of Pommerman. This domain is very challenging for reinforcement learning (RL): past work has shown that model-free RL algorithms fail to achieve significant learning. In this paper, we shed light on the reasons behind this failure by exemplifying and analyzing the high rate of catastrophic events (i.e., suicides) that happen under random exploration in this domain. While model-free random exploration is typically futile, we propose a new framework where even a non-expert simulated demonstrator, e.g., planning algorithms such as Monte Carlo tree search with a small number of rollouts, can be integrated into asynchronous distributed deep reinforcement learning methods. Compared to vanilla deep RL algorithms, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.
Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL
Kartal, Bilal, Hernandez-Leal, Pablo, Taylor, Matthew E.
Deep reinforcement learning (DRL) has achieved great successes in recent years with the help of novel methods and higher compute power. However, there are still several challenges to be addressed, such as convergence to locally optimal policies and long training times. In this paper, we first augment the Asynchronous Advantage Actor-Critic (A3C) method with a novel self-supervised auxiliary task, Terminal Prediction, which measures temporal closeness to terminal states, yielding A3C-TP. Second, we propose a new framework where planning algorithms such as Monte Carlo tree search or other sources of (simulated) demonstrators can be integrated into asynchronous distributed DRL methods. Compared to vanilla A3C, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.
Is multiagent deep reinforcement learning the answer or the question? A brief survey
Hernandez-Leal, Pablo, Kartal, Bilal, Taylor, Matthew E.
Deep reinforcement learning (DRL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent scenarios. Initial results report successes in complex multiagent domains, although there are several challenges to be addressed. In this context, first, this article provides a clear overview of current multiagent deep reinforcement learning (MDRL) literature. Second, it provides guidelines to complement this emerging area by (i) showcasing examples of how methods and algorithms from DRL and multiagent learning (MAL) have helped solve problems in MDRL and (ii) providing general lessons learned from these works. We expect this article will help unify and motivate future research to take advantage of the abundant literature that exists in both areas (DRL and MAL) in a joint effort to promote fruitful research in the multiagent community.
Data Driven Sokoban Puzzle Generation with Monte Carlo Tree Search
Kartal, Bilal (University of Minnesota) | Sohre, Nick (University of Minnesota) | Guy, Stephen J. (University of Minnesota)
In this work, we propose a Monte Carlo Tree Search (MCTS) based approach to procedurally generate Sokoban puzzles. Our method generates puzzles through simulated game play, guaranteeing solvability in all generated puzzles. We perform a user study to infer features that are efficient to compute and are highly correlated with expected puzzle difficulty. We combine several of these features into a data-driven evaluation function for MCTS puzzle creation. The resulting algorithm is efficient and can be run in an anytime manner, capable of quickly generating a variety of challenging puzzles. We perform a second user study to validate the predictive capability of our approach, showing a high correlation between increasing puzzle scores and perceived difficulty.
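The data-driven evaluation can be pictured as a weighted sum of cheap-to-compute features used to score puzzles during MCTS generation; the feature names and weights below are placeholders, not the coefficients fitted from the user study:

    # Illustrative weighted-feature evaluation for MCTS puzzle generation.
    from typing import Optional

    def puzzle_score(features: dict, weights: Optional[dict] = None) -> float:
        weights = weights or {
            "num_boxes": 0.3,           # more boxes tend to raise difficulty
            "solution_length": 0.5,     # longer minimal push sequences
            "terrain_congestion": 0.2,  # tighter corridors and dead ends
        }
        return sum(w * features.get(name, 0.0) for name, w in weights.items())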
Monte Carlo Tree Search for Multi-Robot Task Allocation
Kartal, Bilal (University of Minnesota) | Nunes, Ernesto (University of Minnesota) | Godoy, Julio (University of Minnesota) | Gini, Maria (University of Minnesota)
Multi-robot teams are useful in a variety of task allocation domains such as warehouse automation and surveillance. Robots in such domains perform tasks at given locations and specific times, and are allocated tasks to optimize given team objectives. We propose an efficient, satisficing, and centralized Monte Carlo Tree Search based algorithm that exploits the branch-and-bound paradigm to solve the multi-robot task allocation problem with spatial, temporal, and other side constraints. Unlike previous heuristics proposed for this problem, our approach offers theoretical guarantees and finds optimal solutions for some non-trivial data sets.
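The branch-and-bound ingredient can be sketched as a pruning test applied when expanding a node of the allocation search tree: a child is discarded when an optimistic bound on the objective reachable from it cannot beat the best complete allocation found so far. The node and bound interfaces below are assumptions for illustration:

    # Illustrative branch-and-bound pruning inside tree-search node expansion.
    def expand_with_bounding(partial_allocation, children, upper_bound, incumbent_value):
        """Return only the children whose optimistic bound can still beat the incumbent."""
        return [child for child in children(partial_allocation)
                if upper_bound(child) > incumbent_value]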
Generating Believable Stories in Large Domains
Kartal, Bilal (University of Minnesota) | Koenig, John (University of Minnesota) | Guy, Stephen J. (University of Minnesota)
Planning-based techniques are a very powerful tool for automated story generation. However, as the number of possible actions increases, traditional planning techniques suffer from a combinatorial explosion due to large branching factors. In this work, we apply Monte Carlo Tree Search (MCTS) techniques to generate stories in domains with large numbers of possible actions (100+). Our approach employs a Bayesian story evaluation method to guide the planning towards believable stories that reach a user-defined goal. We generate stories in a novel domain with different types of story goals. Our approach shows an order-of-magnitude improvement in performance over traditional search techniques.
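One way such a Bayesian evaluation could look is a bigram model over successive story actions, fit on example stories and used to score rollouts during search; the model below is a simplified stand-in, not the paper's evaluation function:

    # Illustrative bigram believability score for guiding story rollouts.
    from collections import defaultdict
    import math

    class BigramBelievability:
        def __init__(self):
            self.pair_counts = defaultdict(int)
            self.prev_counts = defaultdict(int)

        def fit(self, example_stories):
            """example_stories: iterable of sequences of action names."""
            for story in example_stories:
                for prev, nxt in zip(story, story[1:]):
                    self.pair_counts[(prev, nxt)] += 1
                    self.prev_counts[prev] += 1

        def score(self, story):
            """Geometric mean of smoothed P(action_i | action_{i-1}) over the story."""
            if len(story) < 2:
                return 1.0
            log_p = 0.0
            for prev, nxt in zip(story, story[1:]):
                p = (self.pair_counts[(prev, nxt)] + 1) / (self.prev_counts[prev] + 10)
                log_p += math.log(p)
            return math.exp(log_p / (len(story) - 1))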