Google's AlphaGo beats Lee Sedol at the game of Go
In 2016, major automakers like Tesla and Ford announced timelines for releasing fully autonomous vehicles. DeepMind's AlphaGo, Google's AI system, beat world champion Lee Sedol at one of the most complex board games in history. And other major advances in AI have had big implications for healthcare, with some systems proving more effective at detecting cancer than human doctors. Want to learn what other cool things AI did in 2016? Here are TechRepublic's top picks.
In designing Markov Decision Processes (MDPs), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there is no clear choice of reward function, and in these cases significant care must be taken to construct a reward function that induces the desired behavior. In this paper, we consider an analogous design problem: crafting a target distribution in Targeted Trajectory Distribution MDPs (TTD-MDPs). TTD-MDPs are an extension of MDPs that provide variety of experience during repeated execution; they produce probabilistic policies that minimize divergence from a target distribution over trajectories of an underlying MDP. Here, we present a brief overview of TTD-MDPs along with approaches for constructing target distributions. We then present a novel authorial idiom for creating target distributions using prototype trajectories, and evaluate these approaches on a drama manager for an interactive game.
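The core idea above can be sketched concretely. In a tree-shaped MDP with deterministic transitions, a target distribution over complete trajectories induces a unique probabilistic policy: at each node, choose an action with probability proportional to the target mass of all trajectories passing through that child. This is an illustrative sketch of that propagation step only, not the paper's algorithm; the trajectories and probabilities are invented for the example.

```python
from collections import defaultdict

# Hypothetical target distribution over complete trajectories (tuples of
# actions from the root); probabilities sum to 1.
target = {
    ("a", "c"): 0.5,
    ("a", "d"): 0.2,
    ("b", "e"): 0.3,
}

def trajectory_policy(target):
    """Return P(action | prefix) so that sampling actions step by step
    reproduces the target distribution over complete trajectories."""
    mass = defaultdict(float)  # prefix -> total target mass through it
    for traj, p in target.items():
        for i in range(len(traj) + 1):
            mass[traj[:i]] += p
    policy = {}
    for traj in target:
        for i in range(len(traj)):
            prefix, action = traj[:i], traj[i]
            policy[(prefix, action)] = mass[traj[:i + 1]] / mass[prefix]
    return policy

policy = trajectory_policy(target)
# At the root, action "a" carries the combined mass of ("a","c") and ("a","d"),
# so it is chosen with probability 0.7.
```

In a real TTD-MDP the underlying dynamics may be stochastic, so exact matching is generally impossible and the policy instead minimizes divergence from the target; the tree case here is the exactly-solvable special case.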
Larger data sets and faster computers have enabled a recent flurry of progress--and investment--in artificial intelligence. David Cox of Harvard thinks the next big jump will depend on understanding what happens inside the head of a rat when it plays video games. Cox leads a $28 million project called Ariadne, funded by the U.S. Office of the Director of National Intelligence, that is looking for clues in mammalian brains to make software smarter. "This is a huge, moonshot-like effort to go into the brain and see what clues and tricks are hiding there for us to find," he said today at EmTech MIT 2016. Recent progress in tasks such as image recognition and translation sprang from putting more computing power behind a technique known as deep learning, which is loosely inspired by neuroscience.
Recurrent neural networks (RNNs) are an effective representation of control policies for a wide range of reinforcement and imitation learning problems. RNN policies, however, are particularly difficult to explain, understand, and analyze due to their use of continuous-valued memory vectors and observation features. In this paper, we introduce a new technique, Quantized Bottleneck Insertion, to learn finite representations of these vectors and features. The result is a quantized representation of the RNN that can be analyzed to improve our understanding of memory use and general behavior. We present results of this approach on synthetic environments and six Atari games. The resulting finite representations are surprisingly small in some cases, using as few as 3 discrete memory states and 10 observations for a perfect Pong policy. We also show that these finite policy representations lead to improved interpretability.
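To make the quantization idea concrete: once memory vectors are pushed through a discrete bottleneck, each continuous hidden state collapses to a small code, and the set of distinct codes observed across trajectories forms a finite state set that can be inspected directly. The sketch below uses a fixed 3-level rounding rule in place of the paper's learned bottleneck (which is trained as an autoencoder), and the "memory vectors" are random stand-ins rather than real RNN activations.

```python
import numpy as np

def quantize(h):
    """Map a continuous memory vector to a discrete code in {-1, 0, 1}^d.
    A fixed rounding rule standing in for a learned quantized bottleneck."""
    return tuple(np.sign(np.round(h)).astype(int))

rng = np.random.default_rng(0)
hiddens = rng.normal(0.0, 1.0, size=(200, 4))  # stand-in RNN memory vectors

# The distinct codes observed over many trajectories become the state set of
# the extracted finite-state machine; often far fewer than 3**4 appear.
states = {quantize(h) for h in hiddens}
```

Transitions between these discrete states (and similarly quantized observations) can then be tabulated to recover a small, human-readable machine, which is how a handful of memory states can suffice for a policy like Pong.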
Google cut power usage in its data centers by several percentage points earlier this year by trusting artificially intelligent software derived from 1980s-era Atari video games. And in the years to come, the Internet giant not only could save much more electricity, but also solve far larger problems by taking on a much more complex video game. Research scientists at Google's DeepMind unit announced Friday they are developing a computer program that reads data about Blizzard Entertainment's "StarCraft II" games and learns how to play on its own. The software would have to figure out how to split its attention between micromanagement and long-term strategic decisions. It's that maneuvering that could deliver big breakthroughs.