AITopics

1903.0411

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Games > Computer Games (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

#artificialintelligence Mar-9-2019, 23:58:49 GMT

The Promise of Hierarchical Reinforcement Learning

"This top-down planning approach decides what a good subgoal is before planning to achieve it." "For complex, high-dimensional Markov Decision Processes (MDPs), it may be necessary to represent the policy with function approximation. A problem is misspecified whenever the representation cannot express any policy with acceptable performance."

machine learning, reinforcement, reinforcement learning, (16 more...)

Country: Europe > United Kingdom > England (0.28)

Genre:

Research Report (0.68)
Overview (0.46)

Industry:

Education (0.94)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Fang, Kuan, Toshev, Alexander, Fei-Fei, Li, Savarese, Silvio

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks

arXiv.org Machine Learning Mar-9-2019

Many robotic applications require the agent to perform long-horizon tasks in partially observable environments. In such applications, decision making at any step can depend on observations received far in the past. Hence, being able to properly memorize and utilize the long-term history is crucial. In this work, we propose a novel memory-based policy, named Scene Memory Transformer (SMT). The proposed policy embeds and adds each observation to a memory and uses the attention mechanism to exploit spatio-temporal dependencies. This model is generic and can be efficiently trained with reinforcement learning over long episodes. On a range of visual navigation tasks, SMT demonstrates superior performance to existing reactive and memory-based policies by a margin.
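The memory-plus-attention idea in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the `SceneMemory` class and `attend` function are hypothetical names, observations are assumed to be already embedded as plain vectors, and the learned projections and policy head are left out. It only shows how a query can weight every stored observation, however old, by similarity.

```python
import math


def attend(query, memory):
    """Scaled dot-product attention of one query over a list of memory vectors.

    Hypothetical helper: keys and values are the same vectors, with no
    learned projections.
    """
    dim = len(query)
    scores = [
        sum(q * m for q, m in zip(query, slot)) / math.sqrt(dim)
        for slot in memory
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(dim)
    ]
    return weights, context


class SceneMemory:
    """Append-only store of observation embeddings (illustrative only)."""

    def __init__(self):
        self.slots = []

    def add(self, embedding):
        self.slots.append(list(embedding))

    def read(self, query):
        return attend(query, self.slots)


memory = SceneMemory()
for embedding in ([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]):
    memory.add(embedding)
weights, context = memory.read([1.0, 0.0])
```

Because nothing is ever overwritten, the first observation is as accessible to the query as the most recent one, which is the property the abstract stresses for long-horizon tasks.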

machine learning, natural language, reinforcement learning, (17 more...)

1903.03878

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.83)
(3 more...)

Milli, Smitha, Dragan, Anca D.

Literal or Pedagogic Human? Analyzing Human Model Misspecification in Objective Learning

arXiv.org Artificial Intelligence Mar-9-2019

It is incredibly easy for a system designer to misspecify the objective for an autonomous system ("robot"), thus motivating the desire to have the robot learn the objective from human behavior instead. Recent work has suggested that people have an interest in the robot performing well, and will thus behave pedagogically, choosing actions that are informative to the robot. In turn, robots benefit from interpreting the behavior by accounting for this pedagogy. In this work, we focus on misspecification: we argue that robots might not know whether people are being pedagogic or literal and that it is important to ask which assumption is safer to make. We cast objective learning into the more general form of a common-payoff game between the robot and human, and prove that in any such game literal interpretation is more robust to misspecification. Experiments with human data support our theoretical results and point to the sensitivity of the pedagogic assumption.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1903.03877

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)

#artificialintelligence Mar-8-2019, 12:48:12 GMT

Now any business can access the same type of AI that powered AlphaGo

A startup called CogitAI has developed a platform that lets companies use reinforcement learning, the technique that gave AlphaGo mastery of the board game Go. Gaining experience: AlphaGo, an AI program developed by DeepMind, taught itself to play Go by practicing. It's practically impossible for a programmer to manually code in the best strategies for winning. Instead, reinforcement learning let the program figure out how to defeat the world's best human players on its own. Drug delivery: Reinforcement learning is still an experimental technology, but it is gaining a foothold in industry. Amazon recently launched a reinforcement-learning platform, but it is aimed more at researchers and academics.

machine learning, platform, reinforcement learning, (4 more...)

AI-Alerts: 2019 > 2019-03 > AAAI AI-Alert for Mar 12, 2019 (1.00)

Industry: Leisure & Entertainment > Games > Go (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Games > Go (0.88)

#artificialintelligence Mar-8-2019, 04:03:48 GMT

This new AI tool ages faces in videos with creepy accuracy

A new machine learning paper shows how AI can take footage of someone and duplicate the video with the subject looking an age the researchers specify. The team behind the paper, from the University of Arkansas, Clemson University, Carnegie Mellon University, and Concordia University in Canada, claim that this is one of the first methods to use AI to tackle aging in videos. The system was trained on an expanded dataset of photos showing individuals at different ages. Reinforcement learning, a technique that rewards an AI model for getting a task correct, comes into play by rewarding the system when the synthesized features, like wrinkles, appear similarly across consecutive video frames. Similar approaches power the "deepfake" technology that has raised alarms about the prospect of AI-powered video propaganda.

new ai tool age face, university, video, (3 more...)

Country:

North America > United States > Arkansas (0.27)
North America > Canada (0.27)

Genre: Research Report (0.36)

Industry:

Information Technology (0.56)
Government (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

#artificialintelligence Mar-8-2019, 02:19:15 GMT

AI Startup Invents Trick For Robots To More Efficiently Teach Themselves Complex Tasks

Google-owned DeepMind uses sophisticated computer simulations for computers to teach themselves how to accomplish certain tasks. The simulated training, known as reinforcement learning, involves the computer trying out thousands (or millions) of different things until it manages to figure out what to do. Using this approach combined with deep learning, the London-based artificial intelligence research unit is teaching computers how to beat the world's best Go players and training robots how to move around in the world. A tiny Berkeley, California-based AI startup, Bonsai, has invented a trick to beat DeepMind in this game. The trick -- the company is calling it "concept networks" -- massively increases the efficiency of reinforcement learning.

bonsai, machine learning, reinforcement learning, (14 more...)

Country: North America > United States > California > Alameda County > Berkeley (0.26)

Industry: Leisure & Entertainment > Games (0.57)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.80)

arXiv.org Artificial Intelligence Mar-8-2019

Skew-Fit: State-Covering Self-Supervised Reinforcement Learning

Pong, Vitchyr H., Dalal, Murtaza, Lin, Steven, Nair, Ashvin, Bahl, Shikhar, Levine, Sergey

In standard reinforcement learning, each new skill requires a manually-designed reward function, which takes considerable manual effort and engineering. Self-supervised goal setting has the potential to automate this process, enabling an agent to propose its own goals and acquire skills that achieve these goals. However, such methods typically rely on manually-designed goal distributions, or heuristics to force the agent to explore a wide range of states. We propose a formal exploration objective for goal-reaching policies that maximizes state coverage. We show that this objective is equivalent to maximizing the entropy of the goal distribution together with goal reaching performance, where goals correspond to entire states. We present an algorithm called Skew-Fit for learning such a maximum-entropy goal distribution, and show that under certain regularity conditions, our method converges to a uniform distribution over the set of possible states, even when we do not know this set beforehand. Skew-Fit enables self-supervised agents to autonomously choose and practice diverse goals. Our experiments show that it can learn a variety of manipulation tasks from images, including opening a door with a real robot, entirely from scratch and without any manually-designed reward function.
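The abstract does not spell out how the goal distribution is pushed toward uniform, so the following is only a sketch under an assumption: that samples drawn from the current state distribution are reweighted by the density raised to a negative power `alpha`, which makes rare states more likely to be proposed as goals. The function names and the three-state toy distribution are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution given as a dict."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def skew(dist, alpha=-0.5):
    """Reweight samples from `dist` by p(s) ** alpha and renormalize.

    Assumption for illustration: alpha lies in [-1, 0). Reweighting samples
    that were drawn from p by p ** alpha gives a distribution proportional
    to p ** (1 + alpha), so alpha = -1 yields the uniform distribution over
    the states seen so far.
    """
    unnormalized = {s: p ** (1.0 + alpha) for s, p in dist.items()}
    total = sum(unnormalized.values())
    return {s: w / total for s, w in unnormalized.items()}


# Toy state-visitation distribution: state "a" dominates.
visited = {"a": 0.7, "b": 0.2, "c": 0.1}
skewed = skew(visited, alpha=-0.5)
uniform = skew(visited, alpha=-1.0)
```

In this toy case the skewed distribution has higher entropy than the original, and repeating the step with goals drawn from the skewed distribution is the kind of iteration that a maximum-entropy goal-setting method relies on.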

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1903.03698

Country: North America > United States > California (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Xu, Ruiyang, Lieberherr, Karl

Learning Self-Game-Play Agents for Combinatorial Optimization Problems

arXiv.org Artificial Intelligence Mar-8-2019

Recent progress in reinforcement learning (RL) using self-game-play has shown remarkable performance on several board games (e.g., Chess and Go) as well as video games (e.g., Atari games and Dota2). It is plausible to consider that RL, starting from zero knowledge, might be able to gradually approximate a winning strategy after a certain amount of training. In this paper, we explore neural Monte-Carlo-Tree-Search (neural MCTS), an RL algorithm which has been applied successfully by DeepMind to play Go and Chess at a super-human level. We try to leverage the computational power of neural MCTS to solve a class of combinatorial optimization problems. Following the idea of Hintikka's Game-Theoretical Semantics, we propose the Zermelo Gamification (ZG) to transform specific combinatorial optimization problems into Zermelo games whose winning strategies correspond to the solutions of the original optimization problem. The ZG also provides a specially designed neural MCTS. We use a combinatorial planning problem for which the ground-truth policy is efficiently computable to demonstrate that ZG is promising.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1903.03674

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(2 more...)

arXiv.org Artificial Intelligence Mar-8-2019

Learning Heuristics over Large Graphs via Deep Reinforcement Learning

Mittal, Akash, Dhawan, Anuj, Medya, Sourav, Ranu, Sayan, Singh, Ambuj

In this paper, we propose a deep reinforcement learning framework called GCOMB to learn algorithms that can solve combinatorial problems over large graphs. GCOMB mimics the greedy algorithm in the original problem and incrementally constructs a solution. The proposed framework utilizes a Graph Convolutional Network (GCN) to generate node embeddings that predict the potential nodes in the solution set from the entire node set. These embeddings enable an efficient training process to learn the greedy policy via Q-learning. Through extensive evaluation on several real and synthetic datasets containing up to a million nodes, we establish that GCOMB is up to 41% better than the state of the art, up to seven times faster than the greedy algorithm, robust and scalable to large dynamic networks.
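The incremental, greedy construction that the abstract says GCOMB mimics can be sketched on a toy coverage problem. This is not the GCOMB implementation: the coverage objective, the `greedy_select` function, and the tiny graph are assumptions chosen for illustration, and the pluggable `score` argument merely marks where a learned Q-function could replace the exact marginal gain.

```python
def marginal_gain(graph, covered, node):
    """Number of not-yet-covered nodes that picking `node` would cover."""
    return len((graph[node] | {node}) - covered)


def greedy_select(graph, budget, score=marginal_gain):
    """Build a solution one node at a time, always taking the best-scoring node.

    `graph` maps each node to the set of its neighbors. `score` is
    hypothetical glue: any callable with the same signature can stand in
    for the exact marginal gain.
    """
    covered, solution = set(), []
    for _ in range(budget):
        candidates = [n for n in graph if n not in solution]
        if not candidates:
            break
        best = max(candidates, key=lambda n: score(graph, covered, n))
        solution.append(best)
        covered |= graph[best] | {best}
    return solution, covered


toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
solution, covered = greedy_select(toy_graph, budget=2)
```

The exact greedy step evaluates every candidate at every iteration, which is the cost a learned scoring function would aim to reduce on large graphs.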

artificial intelligence, machine learning, reinforcement learning, (19 more...)

1903.03332

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)