AITopics

1902.03273

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Lubars, Brian, Tan, Chenhao

Ask Not What AI Can Do, But What AI Should Do: Towards a Framework of Task Delegability

arXiv.org Artificial Intelligence, Feb-8-2019

Although artificial intelligence holds promise for addressing societal challenges, issues of exactly which tasks to automate and the extent to do so remain understudied. We approach the problem of task delegability from a human-centered perspective by developing a framework on human perception of task delegation to artificial intelligence. We consider four high-level factors that can contribute to a delegation decision: motivation, difficulty, risk, and trust. To obtain an empirical understanding of human preferences in different tasks, we build a dataset of 100 tasks from academic papers, popular media portrayal of AI, and everyday life. For each task, we administer a survey to collect judgments of each factor and ask subjects to pick the extent to which they prefer AI involvement. We find little preference for full AI control and a strong preference for machine-in-the-loop designs, in which humans play the leading role. Our framework can effectively predict human preferences in degrees of AI assistance. Among the four factors, trust is the most predictive of human preferences of optimal human-machine delegation. This framework represents a first step towards characterizing human preferences of automation across tasks. We hope this work may encourage and aid in future efforts towards understanding such individual attitudes; our goal is to inform the public and the AI research community rather than dictating any direction in technology development.

artificial intelligence, human preference, machine learning

1902.03245

Country: North America > United States > Colorado (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Plisnier, Hélène, Steckelmacher, Denis, Roijers, Diederik M., Nowé, Ann

The Actor-Advisor: Policy Gradient With Off-Policy Advice

Actor-critic algorithms learn an explicit policy (actor), and an accompanying value function (critic). The actor performs actions in the environment, while the critic evaluates the actor's current policy. However, despite their stability and promising convergence properties, current actor-critic algorithms do not outperform critic-only ones in practice. We believe that the fact that the critic learns Q^pi, instead of the optimal Q-function Q*, prevents state-of-the-art robust and sample-efficient off-policy learning algorithms from being used. In this paper, we propose an elegant solution, the Actor-Advisor architecture, in which a Policy Gradient actor learns from unbiased Monte-Carlo returns, while being shaped (or advised) by the Softmax policy arising from an off-policy critic. The critic can be learned independently from the actor, using any state-of-the-art algorithm. Being advised by a high-quality critic, the actor quickly and robustly learns the task, while its use of the Monte-Carlo return helps overcome any bias the critic may have. In addition to a new Actor-Critic formulation, the Actor-Advisor, a method that allows an external advisory policy to shape a Policy Gradient actor, can be applied to many other domains. By varying the source of advice, we demonstrate the wide applicability of the Actor-Advisor to three other important subfields of RL: safe RL with backup policies, efficient leverage of domain knowledge, and transfer learning in RL. Our experimental results demonstrate the benefits of the Actor-Advisor compared to state-of-the-art actor-critic methods, illustrate its applicability to the three other application scenarios listed above, and show that many important challenges of RL can now be solved using a single elegant solution.
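The idea of an advisory policy shaping the actor can be illustrated with a minimal sketch. The abstract does not say how the advice is combined with the actor's output, so the element-wise product followed by renormalisation used here is an assumption, and the names `softmax` and `advised_policy` and all numbers are hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Turn a list of Q-values into a probability distribution."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def advised_policy(actor_probs, advisor_probs):
    """Combine actor and advisor distributions (assumed rule: product, then renormalise)."""
    mixed = [a * b for a, b in zip(actor_probs, advisor_probs)]
    total = sum(mixed)
    if total == 0.0:
        # The two distributions share no support: fall back to the actor alone.
        return list(actor_probs)
    return [m / total for m in mixed]


# Hypothetical values for one state with three actions.
q_values = [1.0, 2.0, 0.5]       # estimates from an off-policy critic
actor_probs = [0.5, 0.3, 0.2]    # current output of the policy-gradient actor
advice = softmax(q_values)       # Softmax policy arising from the critic
policy = advised_policy(actor_probs, advice)
```

Under this assumed rule, the action favoured by the critic gains probability mass in the executed policy even though the actor alone preferred a different action.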

artificial intelligence, machine learning, reinforcement learning

1902.02556

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Belgium > Flanders (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Stephenson, Matthew, Renz, Jochen

Agent-Based Adaptive Level Generation for Dynamic Difficulty Adjustment in Angry Birds

[...] is a key area of investigation for video game research (Hendrikx et al. 2013; Togelius et al. 2011). PLG can be extremely useful for increasing a game's length and replayability, as it allows a large number of levels to be created in a relatively short time. It is also possible to tailor the generated levels towards a specific user's playstyle, known as adaptive level generation, which allows for a unique and personalised [...]

Section 2 describes the large amount of background and related work, both for Angry Birds and adaptive level generation in general. Section 3 presents our proposed adaptive generation method. Section 4 describes our conducted experiments and results. Section 5 discusses what these results could mean for both human players and agents. Section 6 concludes this work and outlines future possibilities.

artificial intelligence, evolutionary algorithm, machine learning

1902.02518

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > Netherlands > Limburg > Maastricht (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.96)

Expressive mechanisms for equitable rent division on a budget

Velez, Rodrigo A.

We achieve four objectives: (1) each agent is allowed to make a report that expresses her preference about violating her budget constraint, a feature not achieved by mechanisms that only elicit quasi-linear reports; (2) these reports are finite dimensional; (3) computation is feasible in polynomial time; and (4) incentive properties of envy-free mechanisms that elicit quasi-linear reports are preserved.

allocation, artificial intelligence, game theory

1902.02935

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
Asia > Japan > Honshū > Kansai > Wakayama Prefecture > Wakayama (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Game Theory (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)

CESMA: Centralized Expert Supervises Multi-Agents

Lin, Alex Tong, Debord, Mark J., Estabridis, Katia, Hewer, Gary, Osher, Stanley

We consider the reinforcement learning problem of training multiple agents in order to maximize a shared reward. In this multi-agent system, each agent seeks to maximize the reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent observes a non-stationary and partially-observable environment. In order to resolve this issue, we demonstrate a novel multi-agent training framework that first turns a multi-agent problem into a single-agent problem to obtain a centralized expert that is then used to guide supervised learning for multiple independent agents with the goal of decentralizing the policy. We additionally demonstrate a way to turn the exponential growth in the joint action space into a linear growth for the centralized policy. Overall, the problem is twofold: the problem of obtaining a centralized expert, and then the problem of supervised learning to train the multi-agents. We demonstrate our solutions to both of these tasks, and show that supervised learning can be used to decentralize a multi-agent policy.
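The difference between exponential and linear growth in the joint action space can be made concrete with a small counting sketch. The abstract does not describe how the reduction is achieved, so the factored layout below (one block of action scores per agent) is an assumption, and all function names are hypothetical.

```python
def joint_output_size(n_agents, n_actions):
    """Outputs needed to score every joint action directly: n_actions ** n_agents."""
    return n_actions ** n_agents


def factored_output_size(n_agents, n_actions):
    """Outputs needed with one block of action scores per agent: n_agents * n_actions."""
    return n_agents * n_actions


def select_joint_action(factored_scores, n_agents, n_actions):
    """Pick each agent's best action from its own block of a flat score list."""
    joint = []
    for i in range(n_agents):
        block = factored_scores[i * n_actions:(i + 1) * n_actions]
        joint.append(max(range(n_actions), key=block.__getitem__))
    return joint


# Hypothetical setting: 5 agents with 4 actions each.
print(joint_output_size(5, 4))      # 1024
print(factored_output_size(5, 4))   # 20

# Two agents, two actions each; scores are laid out agent by agent.
print(select_joint_action([0.1, 0.9, 0.7, 0.3], 2, 2))  # [1, 0]
```

A factored layout like this assumes each agent's choice can be scored independently by the centralized policy; whether that matches the paper's construction is not stated in the abstract.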

artificial intelligence, deep learning, machine learning

1902.02311

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine Learning, Feb-7-2019

InfoBot: Transfer and Exploration via the Information Bottleneck

Goyal, Anirudh, Islam, Riashat, Strouse, Daniel, Ahmed, Zafarali, Botvinick, Matthew, Larochelle, Hugo, Bengio, Yoshua, Levine, Sergey

A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out decision states. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned policy with an information bottleneck, we can identify decision states by examining where the model actually leverages the goal state. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
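One way to picture "where the model actually leverages the goal" is to measure how far the goal-conditioned action distribution moves away from a goal-agnostic default at each state. The sketch below uses an average KL divergence as that measure; this is an illustrative proxy under that assumption, not the paper's objective, and the states, distributions, and function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def goal_dependence(goal_policies, default_policy):
    """Average divergence of the goal-conditioned policies from the default policy."""
    total = sum(kl_divergence(p, default_policy) for p in goal_policies)
    return total / len(goal_policies)


# Hypothetical corridor state: the action distribution is the same for every goal.
corridor = goal_dependence([[0.9, 0.1], [0.9, 0.1]], [0.9, 0.1])

# Hypothetical junction state: the preferred action flips depending on the goal.
junction = goal_dependence([[0.9, 0.1], [0.1, 0.9]], [0.5, 0.5])
```

Here the corridor scores zero because the goal never changes the behaviour, while the junction scores well above zero, which is the kind of signal that would mark it as a candidate decision state.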

agent, decision state, information

1901.10902

Country:

North America > Canada > Quebec > Montreal (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology (0.46)
Education (0.46)
Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

Czarnecki, Wojciech Marian, Pascanu, Razvan, Osindero, Simon, Jayakumar, Siddhant M., Swirszcz, Grzegorz, Jaderberg, Max

Distilling Policy Distillation

arXiv.org Machine Learning, Feb-6-2019

The transfer of knowledge from one policy to another is an important tool in Deep Reinforcement Learning. This process, referred to as distillation, has been used to great success, for example, by enhancing the optimisation of agents, leading to stronger performance faster, on harder domains [26, 32, 5, 8]. Despite the widespread use and conceptual simplicity of distillation, many different formulations are used in practice, and the subtle variations between them can often drastically change the performance and the resulting objective that is being optimised. In this work, we rigorously explore the entire landscape of policy distillation, comparing the motivations and strengths of each variant through theoretical and empirical analysis. Our results point to three distillation techniques that are preferred depending on the specifics of the task. Specifically, a newly proposed expected entropy regularised distillation allows for quicker learning in a wide range of situations, while still guaranteeing convergence.
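A small numeric sketch shows how two superficially similar distillation losses can differ. It compares the two directions of the KL divergence between a teacher and a student action distribution; this is only one generic example of a formulation choice, not the specific variants analysed in the paper, and the distributions are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical action distributions at a single state.
teacher = [0.7, 0.2, 0.1]
student = [0.4, 0.4, 0.2]

# Same pair of policies, two different loss values.
teacher_to_student = kl_divergence(teacher, student)
student_to_teacher = kl_divergence(student, teacher)

print(round(teacher_to_student, 4))
print(round(student_to_teacher, 4))
```

Because the KL divergence is not symmetric, the two losses assign different values to the same mismatch and therefore produce different gradients for the student.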

arxiv preprint arxiv, distillation, gradient vector field

1902.02186

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Nguyen, Thanh Thi, Nguyen, Ngoc Duy, Nahavandi, Saeid

Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications

arXiv.org Machine Learning, Feb-6-2019

Reinforcement learning (RL) algorithms have been around for decades and have been employed to solve various sequential decision-making problems. These algorithms, however, have faced great challenges when dealing with high-dimensional environments. The recent development of deep learning has enabled RL methods to drive optimal policies for sophisticated and capable agents, which can perform efficiently in these challenging environments. This paper addresses an important aspect of deep RL related to situations that require multiple agents to communicate and cooperate to solve complex tasks. A survey of different approaches to problems related to multi-agent deep RL (MADRL) is presented, including non-stationarity, partial observability, continuous state and action spaces, multi-agent training schemes, and multi-agent transfer learning. The merits and demerits of the reviewed methods will be analyzed and discussed, with their corresponding applications explored. It is envisaged that this review provides insights about various MADRL methods and can lead to future development of more robust and highly useful multi-agent learning methods for solving real-world problems.

agent, learning, reinforcement learning

1812.11794

Country:

North America > United States > Montana (0.04)
Oceania > Australia > Victoria (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Energy (1.00)
Transportation (0.93)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Pustejovsky, James, Krishnaswamy, Nikhil

Situational Grounding within Multimodal Simulations

arXiv.org Artificial Intelligence, Feb-5-2019

In this paper, we argue that simulation platforms enable a novel type of embodied spatial reasoning, one facilitated by a formal model of object and event semantics that renders the continuous quantitative search space of an open-world, real-time environment tractable. We provide examples for how a semantically-informed AI system can exploit the precise, numerical information provided by a game engine to perform qualitative reasoning about objects and events, facilitate learning novel concepts from data, and communicate with a human to improve its models and demonstrate its understanding. We argue that simulation environments, and game engines in particular, bring together many different notions of "simulation" and many different technologies to provide a highly-effective platform for developing both AI systems and tools to experiment in both machine and human intelligence.

artificial intelligence, machine learning, simulation

1902.01886

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.48)