Agents


What's Next For Robotics: In The Field, Inferencing On The Edge - AI Trends

#artificialintelligence

Robots are a key application for AI. In addition to an excellent plenary talk by Julie Shah of MIT, a whole track was dedicated to AI in robotics applications. Dan Kara, VP of robotics and intelligent systems for WTWH Media, outlined some of the challenges in building robots: not chatbots, he clarified, but robots that act in the physical world. "It seems like every year it's just around the corner," he said, but this year the tailwinds are picking up. Robotics is the foundation for much of our work thus far in artificial intelligence and machine learning, Kara argued. "It's only been fairly recently that you've started getting artificial intelligence or machine learning moving off into different labs," he said.


Generating Justifications for Norm-Related Agent Decisions

arXiv.org Artificial Intelligence

We present an approach to generating natural language justifications of decisions derived from norm-based reasoning. Assuming an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as "why" questions that require the agent to compare actual behavior to counterfactual trajectories with respect to these rules. To produce natural-sounding explanations, we focus on the subproblem of producing natural language clauses from statements in a fragment of temporal logic, and then describe how to embed these clauses into explanatory sentences. We use a human judgment evaluation on a testbed task to compare our approach to variants in terms of intelligibility, mental model and perceived trust.


An A.I. has beat humans at yet another of our own games

#artificialintelligence

Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged by consensus as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multiagent challenges. Over the course of a decade and numerous competitions [1–3], the best results have been made possible by hand-crafting major elements of the system, simplifying important aspects of the game, or using superhuman capabilities [4]. Even with these modifications, no previous system has come close to rivalling the skill of top players in the full game. We chose to address the challenge of StarCraft using general purpose learning methods that are in principle applicable to other complex domains: a multi-agent reinforcement learning algorithm that uses data from both human and agent games within a diverse league of continually adapting strategies and counterstrategies, each represented by deep neural networks [5,6]. We evaluated our agent, AlphaStar, in the full game of StarCraft II, through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8% of officially ranked human players.


Embodied Agent - an overview

#artificialintelligence

For an autonomous embodied agent acting in the real world (e.g., an animal, a human, or a robot), perceptual categorization--the ability to make distinctions--is a hard problem (Harnad, 2005). First, based on the stimulation impinging on its sensory arrays (sensation) the agent has to rapidly determine and attend to what needs to be categorized. Second, the appearance and properties of objects or events in the environment being classified fluctuate continuously, for example owing to occlusions, or changes of distances and orientations with respect to the agent. And third, the environmental conditions (e.g., illumination, viewpoint, and background noise) vary considerably. There is much relevant work in computer vision that has been devoted to extracting scale- and translation-invariant low-level visual features and high-level multidimensional representations for the purpose of robust perceptual categorization (Riesenhuber & Poggio, 2002).


PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning

arXiv.org Machine Learning

Single-agent deep reinforcement learning has achieved impressive performance in many domains, including playing Go [1, 2] and Atari games [3, 4]. However, many real-world problems, such as traffic congestion reduction [5, 6], antenna tilt control [7], and dynamic resource allocation [8], are more naturally modeled as multi-agent systems. Unfortunately, directly applying single-agent reinforcement learning to each agent in a multi-agent system does not yield satisfactory performance [9, 10]. In particular, in multi-agent reinforcement learning [8, 10-19], estimating the value function is challenging because the environment is non-stationary from the perspective of an individual agent [10, 11]. To alleviate this issue, multi-agent deep deterministic policy gradient (MADDPG) [10] recently proposed a centralized critic whose input is the concatenation of all agents' observations and actions.
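
A critic that consumes a concatenation of per-agent observations and actions depends on the order in which the agents are listed. The sketch below illustrates that point on a toy linear critic and contrasts it with mean pooling, which ignores agent order. Mean pooling, the function names, and all numeric values are assumptions chosen for illustration; this is not the architecture proposed in the paper.

```python
from typing import List, Sequence


def concat_input(obs_actions: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-agent (observation, action) vectors in agent order."""
    return [x for vec in obs_actions for x in vec]


def mean_pooled_input(obs_actions: Sequence[Sequence[float]]) -> List[float]:
    """Average per-agent vectors element-wise; the result ignores agent order."""
    n = len(obs_actions)
    dim = len(obs_actions[0])
    return [sum(vec[k] for vec in obs_actions) / n for k in range(dim)]


def linear_critic(features: Sequence[float], weights: Sequence[float]) -> float:
    """Toy critic: a dot product standing in for a neural network."""
    return sum(f * w for f, w in zip(features, weights))


# Hypothetical per-agent vectors: [observation, action] for three agents.
agents = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
# The same joint state with the agents listed in a different order.
permuted = [agents[2], agents[0], agents[1]]

# Hypothetical critic weights (illustrative values only).
concat_weights = [0.5, -1.0, 2.0, 0.25, -0.5, 1.0]
pooled_weights = [0.5, -1.0]

q_concat = linear_critic(concat_input(agents), concat_weights)
q_concat_perm = linear_critic(concat_input(permuted), concat_weights)
q_pool = linear_critic(mean_pooled_input(agents), pooled_weights)
q_pool_perm = linear_critic(mean_pooled_input(permuted), pooled_weights)

print("concatenation:", q_concat, q_concat_perm)  # values differ under permutation
print("mean pooling: ", q_pool, q_pool_perm)      # values agree under permutation
```

With concatenation, the critic would have to learn the same value separately for every ordering of the agents; an order-independent input removes that redundancy, at the cost of the pooling step discarding which agent contributed which vector.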


Learning Fairness in Multi-Agent Systems

arXiv.org Artificial Intelligence

Fairness is essential for human society, contributing to stability and productivity. Similarly, fairness is key for many multi-agent systems. Incorporating fairness into multi-agent learning could help multi-agent systems become both efficient and stable. However, learning efficiency and fairness simultaneously is a complex, multi-objective, joint-policy optimization. To tackle these difficulties, we propose FEN, a novel hierarchical reinforcement learning model. We first decompose fairness for each agent and propose a fair-efficient reward that each agent learns its own policy to optimize. To avoid multi-objective conflict, we design a hierarchy consisting of a controller and several sub-policies, where the controller maximizes the fair-efficient reward by switching among the sub-policies, which provide diverse behaviors for interacting with the environment. FEN can be trained in a fully decentralized way, making it easy to deploy in real-world applications. Empirically, we show that FEN easily learns both fairness and efficiency and significantly outperforms baselines in a variety of multi-agent scenarios.


Linear Speedup in Saddle-Point Escape for Decentralized Non-Convex Optimization

arXiv.org Machine Learning

Under appropriate cooperation protocols and parameter choices, fully decentralized solutions for stochastic optimization have been shown to match the performance of centralized solutions and result in linear speedup (in the number of agents) relative to non-cooperative approaches in the strongly-convex setting. More recently, these results have been extended to the pursuit of first-order stationary points in non-convex environments. In this work, we examine in detail the dependence of second-order convergence guarantees on the spectral properties of the combination policy for non-convex multi-agent optimization. We establish linear speedup in saddle-point escape time in the number of agents for symmetric combination policies and study the potential for further improvement by employing asymmetric combination weights. The results imply that a linear speedup can be expected in the pursuit of second-order stationary points, which exclude local maxima as well as strict saddle-points and correspond to local or even global minima in many important learning settings.
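
To make the notion of a combination policy concrete, the sketch below applies a symmetric weight matrix repeatedly to a set of per-agent scalar estimates. It shows only the combination (averaging) step in isolation, without any gradient update; the matrix `A`, the function names, and the starting values are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def combine(A: Sequence[Sequence[float]], psi: Sequence[float]) -> List[float]:
    """One combination step: agent i forms the weighted average sum_j A[i][j] * psi[j]."""
    n = len(psi)
    return [sum(A[i][j] * psi[j] for j in range(n)) for i in range(n)]


def is_symmetric(A: Sequence[Sequence[float]]) -> bool:
    """Check A[i][j] == A[j][i] for all pairs of agents."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


# Hypothetical symmetric combination matrix for three fully connected agents.
# Every row sums to 1, so by symmetry every column does as well.
A = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]

# Hypothetical intermediate estimates held by each agent before combining.
psi = [1.0, 4.0, 7.0]
network_average = sum(psi) / len(psi)

w = list(psi)
for _ in range(20):
    w = combine(A, w)

print("symmetric policy:", is_symmetric(A))
print("network average: ", network_average)
print("after 20 steps:  ", w)
```

For this particular matrix, the second-largest eigenvalue magnitude is 0.25, so the disagreement between agents shrinks by a factor of four per step while the network average is preserved. This is the sense in which the spectral properties of the combination policy govern how quickly the agents come to agree.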


Network Classifiers With Output Smoothing

arXiv.org Artificial Intelligence

This work introduces two strategies for training network classifiers with heterogeneous agents. One strategy promotes global smoothing over the graph and a second strategy promotes local smoothing over neighbourhoods. It is assumed that the feature sizes can vary from one agent to another, with some agents observing insufficient attributes to be able to make reliable decisions on their own. As a result, cooperation with neighbours is necessary. However, because the feature dimensions differ across the agents, their classifier dimensions will also differ. This means that cooperation cannot rely on combining the classifier parameters. We instead propose smoothing the outputs of the classifiers, which are the predicted labels. By doing so, the dynamics that describe the evolution of the network classifier become more challenging than usual because the classifier parameters end up appearing as part of the regularization term as well. We illustrate performance by means of computer simulations.


Towards A Logical Account of Epistemic Causality

arXiv.org Artificial Intelligence

Reasoning about observed effects and their causes is important in multi-agent contexts. While there has been much work on causality from an objective standpoint, causality from the point of view of some particular agent has received much less attention. In this paper, we address this issue by incorporating an epistemic dimension into an existing formal model of causality. We define what it means for an agent to know the causes of an effect. Then, using a counterexample, we prove that epistemic causality is a different notion from its objective counterpart.

1 Introduction

Research on actual causality involves finding in a given narrative (trace) the event that caused an effect. Pearl [25, 26] pioneered the computational enquiry into actual causality. The research was later continued by Halpern and Pearl [12, 15] and others [8, 17, 18, 13, 14]. Unfortunately, as argued by Glymour et al. [9], most of these accounts are developed by analyzing a handful of simple examples, and then validated relative to our intuition for these examples, a process which Gößler et al. [11] referred to as TEGAR (i.e. As such, even after multiple revisions, these definitions continue to suffer from various conceptual problems such as the early preemption problem and the over-determination problem. For instance, despite claims to the contrary, the definitions given in [14] suffer from the problem of preemption, which occurs when two competing events try to achieve the same effect and the latter of these fails to do so as the earlier one has already achieved the effect (see [31] and [4] for a discussion). In an attempt to address these issues, Batusov and Soutchanski [2, 3] recently proposed a new definition of actual causality that is based on a well-developed and expressive formalization of actions and change, namely the situation calculus [23, 27]. The definition is derived from first principles and does not follow a TEGAR scheme.


Acceptable Planning: Influencing Individual Behavior to Reduce Transportation Energy Expenditure of a City

Journal of Artificial Intelligence Research

Our research aims at developing intelligent systems to reduce the transportation-related energy expenditure of a large city by influencing individual behavior. We introduce Copter, an intelligent travel assistant that evaluates multi-modal travel alternatives to find a plan that is acceptable to a person given their context and preferences. We propose a formulation for acceptable planning that brings together ideas from AI, machine learning, and economics. This formulation has been incorporated into Copter, which produces acceptable plans in real time. We adopt a novel empirical evaluation framework that combines human decision data with a high-fidelity multi-modal transportation simulation to demonstrate a 4% energy reduction and 20% delay reduction in a realistic deployment scenario in Los Angeles, California, USA. This article is part of the special track on AI and Society.