Authorial Idioms for Target Distributions in TTD-MDPs

AAAI Conferences

In designing Markov Decision Processes (MDPs), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there is no clear choice of reward function, and in these cases significant care must be taken to construct a reward function that induces the desired behavior. In this paper, we consider an analogous design problem: crafting a target distribution in Targeted Trajectory Distribution MDPs (TTD-MDPs). TTD-MDPs produce probabilistic policies that minimize divergence from a target distribution of trajectories from an underlying MDP. They are an extension of MDPs that provide variety of experience during repeated execution. Here, we present a brief overview of TTD-MDPs with approaches for constructing target distributions. Then we present a novel authorial idiom for creating target distributions using prototype trajectories. We evaluate these approaches on a drama manager for an interactive game.


The 6 most exciting AI advances of 2016 - TechRepublic

#artificialintelligence

In 2016, major automakers like Tesla and Ford announced timelines for releasing fully autonomous vehicles. DeepMind's AlphaGo, Google's AI system, beat the world champ Lee Sedol at one of the most complex board games in history. And other major advancements in AI have had big implications in healthcare, with some systems proving more effective in detecting cancer than human doctors. Want to learn what other cool things AI did in 2016? Here are TechRepublic's top picks.


Decoding the human brain

#artificialintelligence

CHENNAI: Google DeepMind's AlphaGo, an artificial intelligence programme developed using deep neural networks and machine learning techniques, hit global headlines last year when it beat South Korean Go grandmaster Lee Sedol to win the series 4-1. However, not many know that AlphaGo consumed a whopping 30,000 watts of power to complete the task, while the human brain consumes around 20 watts! What gives the human brain such efficiency has so far proven elusive to replicate in computers. Not surprisingly, man's most defining organ is also the least understood. Although an adult human brain weighing 1.4 kg is made up of close to 100 billion neurons, scientists do not know how many different kinds of human neurons exist.


The Moral Imperative of Artificial Intelligence

#artificialintelligence

The big news on March 12 of this year was of the Go-playing AI system AlphaGo securing victory against 18-time world champion Lee Se-dol by winning the third straight game of a five-game match in Seoul, Korea. After Deep Blue's victory against chess world champion Garry Kasparov in 1997, the game of Go was the next grand challenge for game-playing artificial intelligence. Go has defied the brute-force methods in game-tree search that worked so successfully in chess. In 2012, Communications published a Research Highlight article by Sylvain Gelly et al. on computer Go, which reported that "Programs based on Monte-Carlo tree search now play at human-master levels and are beginning to challenge top professional players." AlphaGo combines tree-search techniques with search-space reduction techniques that use deep learning.

