 Agents


overnewser, The best real-time news sites information.

#artificialintelligence

In this contributed article, Sharmistha Sarkar of India-based Progressive Markets highlights a handful of compelling technology advancements that are helping to drive the evolution of artificial intelligence. The industry is expected to grow at a CAGR of 46.5% from 2017 to 2025. The market is growing fast due to improved productivity through AI, its diversified application areas, and big data integration drive....


Counterfactual Multi-Agent Policy Gradients

arXiv.org Artificial Intelligence

Cooperative multi-agent systems can be naturally used to model many real world problems, such as network packet routing and the coordination of autonomous vehicles. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies. In addition, to address the challenges of multi-agent credit assignment, it uses a counterfactual baseline that marginalises out a single agent's action, while keeping the other agents' actions fixed. COMA also uses a critic representation that allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA in the testbed of StarCraft unit micromanagement, using a decentralised variant with significant partial observability. COMA significantly improves average performance over other multi-agent actor-critic methods in this setting, and the best performing agents are competitive with state-of-the-art centralised controllers that get access to the full state.
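The counterfactual baseline described in the abstract can be illustrated with a small sketch. It assumes the centralised critic has already returned one Q-value per candidate action of a single agent, with the other agents' actions held fixed; the function name `counterfactual_advantage` and the numbers are hypothetical and are not taken from the paper's code.

```python
def counterfactual_advantage(q_values, policy, taken_action):
    """Advantage of one agent's action against a counterfactual baseline.

    q_values[i]: critic estimate if this agent took action i while the
                 other agents' actions stay fixed (assumed to be given).
    policy[i]:   this agent's probability of choosing action i.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Marginalise out this agent's action under its own policy.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[taken_action] - baseline


# Illustrative numbers only.
q = [1.0, 3.0, 2.0]
pi = [0.5, 0.25, 0.25]
advantage = counterfactual_advantage(q, pi, taken_action=1)
print(advantage)  # 1.25
```

Because the baseline is the policy-weighted mean of the same Q-values, the advantage averages to zero under the agent's own policy, which is the property that makes it usable as a baseline.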


Intelligent Agents: An A.I. View of Optimization

#artificialintelligence

As a digital analyst or marketer, you know the importance of analytical decision making. Go to any industry conference, blog, or meetup, or even just read the popular press, and you will hear and see topics like machine learning, artificial intelligence, and predictive analytics everywhere. Because many of us don't come from a technical/statistical background, this can be both a little confusing and intimidating. But don't sweat it: in this post, I will try to clear up some of this confusion by introducing a simple, yet powerful framework – the intelligent agent – which will help link these new ideas with familiar tools and concepts like A/B Testing and Optimization. Note: the intelligent agent framework is used as the guiding principle in Russell and Norvig's excellent text Artificial Intelligence: A Modern Approach – it's an awesome book, and I recommend anyone who wants to learn more to go get a copy or check out their online AI course.


Using Machine Learning Agents in a real game: a beginner's guide – Unity Blog

#artificialintelligence

My name is Alessia Nigretti and I am a Technical Evangelist for Unity. My job is to introduce Unity's new features to developers. My fellow evangelist Ciro Continisio and I developed the first demo game that uses the new Unity Machine Learning Agents system and showed it at DevGamm Minsk 2017. This post is based on our talk and explains what we learned making the demo. At the same time, we invite you to join the ML-Agents Challenge and show off your creative use-cases of the toolkit.


Multi-focus Attention Network for Efficient Deep Reinforcement Learning

arXiv.org Machine Learning

Deep reinforcement learning (DRL) has shown incredible performance in learning various tasks to the human level. However, unlike human perception, current DRL models connect the entire low-level sensory input to the state-action values rather than exploiting the relationship between and among entities that constitute the sensory input. Because of this difference, DRL needs a vast amount of experience samples to learn. In this paper, we propose a Multi-focus Attention Network (MANet) which mimics the human ability to spatially abstract the low-level sensory input into multiple entities and attend to them simultaneously. The proposed method first divides the low-level input into several segments which we refer to as partial states. After this segmentation, parallel attention layers attend to the partial states relevant to solving the task. Our model estimates state-action values using these attended partial states. In our experiments, MANet attains the highest scores with significantly fewer experience samples. Additionally, the model shows higher performance compared to the Deep Q-network and the single attention model as benchmarks. Furthermore, we extend our model to an attentive communication model for performing multi-agent cooperative tasks. In multi-agent cooperative task experiments, our model shows 20% faster learning than the existing state-of-the-art model.
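The idea of several attention layers attending to partial states in parallel can be sketched with generic dot-product soft attention. This is an assumption-laden illustration, not the MANet architecture: the names `soft_attention` and `multi_focus`, the use of one query vector per attention layer, and the toy keys and values are all hypothetical.

```python
import math


def soft_attention(query, keys, values):
    """Softmax attention of one query over a list of partial states.

    keys[i] and values[i] are assumed encodings of partial state i.
    Returns the attention weights and the weighted sum of the values.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    attended = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, attended


def multi_focus(queries, keys, values):
    """Run one attention pass per query, each free to focus on a different segment."""
    return [soft_attention(q, keys, values) for q in queries]


# Two partial states and two attention layers with different queries.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0], [20.0]]
for weights, attended in multi_focus([[5.0, 0.0], [0.0, 5.0]], keys, values):
    print(weights, attended)
```

In this toy setup the first query concentrates its weight on the first partial state and the second query on the second, so the two attended vectors summarise different parts of the input.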


Explicability as Minimizing Distance from Expected Behavior

arXiv.org Artificial Intelligence

For effective human-AI collaboration, it is not enough to address the question of autonomy; an equally important question is how the AI's behavior is perceived by its human counterparts. When an AI agent's task plans are generated without such considerations, they may often demonstrate inexplicable behavior from the human's point of view. This problem arises due to the human's partial or inaccurate understanding of the agent's planning process and/or the model. This may have serious implications for human-AI collaboration, from increased cognitive load and reduced trust in the agent, to more serious concerns of safety in interactions with physical agents. In this paper, we address this issue by modeling the notion of plan explicability as a function of the distance between the plan that the agent makes and the plan that the human expects it to make. To this end, we learn a distance function based on different plan distance measures that can accurately model this notion of plan explicability, and develop an anytime search algorithm that can use this distance as a heuristic to come up with progressively more explicable plans. We evaluate the effectiveness of our approach in a simulated autonomous car domain and a physical service robot domain. We provide empirical evaluations that demonstrate the usefulness of our approach in making the planning process of an autonomous agent conform to human expectations.


Modelling collective motion based on the principle of agency

arXiv.org Machine Learning

Collective motion is an intriguing phenomenon, especially considering that it arises from a set of simple rules governing local interactions between individuals. In theoretical models, these rules are normally assumed to take a particular form, possibly constrained by heuristic arguments. We propose a new class of models, which describe the individuals as agents, capable of deciding for themselves how to act and learning from their experiences. The local interaction rules do not need to be postulated in this model, since they emerge from the learning process. We apply this ansatz to a concrete scenario involving marching locusts, in order to model the phenomenon of density-dependent alignment. We show that our learning agent-based model can account for a Fokker-Planck equation that describes the collective motion and, most notably, that the agents can learn the appropriate local interactions, requiring no strong previous assumptions on their form. These results suggest that learning agent-based models are a powerful tool for studying a broader class of problems involving collective motion and animal agency in general.


Watson Virtual Agent Go viral for the right reasons

#artificialintelligence



Diff-DAC: Distributed Actor-Critic for Multitask Deep Reinforcement Learning

arXiv.org Machine Learning

We propose a multiagent distributed actor-critic algorithm for multitask reinforcement learning (MRL), named Diff-DAC. The agents are connected, forming a (possibly sparse) network. Each agent is assigned a task and has access to data from this local task only. During the learning process, the agents are able to communicate some parameters to their neighbors. Since the agents incorporate their neighbors' parameters into their own learning rules, the information is diffused across the network, and they can learn a common policy that generalizes well across all tasks. Diff-DAC is scalable since the computational complexity and communication overhead per agent grow with the number of neighbors, rather than with the total number of agents. Moreover, the algorithm is fully distributed in the sense that agents self-organize, with no need for a coordinator node. Diff-DAC follows an actor-critic scheme where the value function and the policy are approximated with deep neural networks, and is thus able to learn expressive policies from raw data. As a by-product of Diff-DAC's derivation from duality theory, we provide novel insights into the standard actor-critic framework, showing that it is actually an instance of the dual ascent method to approximate the solution of a linear program. Experiments illustrate the performance of the algorithm in the cart-pole, inverted pendulum, and swing-up cart-pole environments.
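How parameters can diffuse across a sparse network without a coordinator can be sketched with a plain neighborhood-averaging step. This is a generic diffusion-style combination under assumed uniform weights, not the Diff-DAC update rule itself (which the abstract does not spell out); `combine_step`, the three-agent line network, and the scalar parameters are hypothetical.

```python
def combine_step(params, neighbors):
    """Average each agent's parameters with those of its direct neighbors.

    params:    dict mapping agent id -> list of floats (its parameters).
    neighbors: dict mapping agent id -> list of neighbor ids (excluding itself).
    The per-agent cost depends only on its number of neighbors.
    """
    combined = {}
    for agent, own in params.items():
        group = [own] + [params[n] for n in neighbors[agent]]
        combined[agent] = [sum(vals) / len(group) for vals in zip(*group)]
    return combined


# Line network 0 - 1 - 2: agents 0 and 2 never talk to each other directly.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
params = {0: [0.0], 1: [3.0], 2: [6.0]}

params = combine_step(params, neighbors)
print(params)  # {0: [1.5], 1: [3.0], 2: [4.5]}
```

Repeating the step keeps pulling the agents' parameters toward one another, even though agents 0 and 2 only exchange information indirectly through agent 1. In an actual learning loop each agent would interleave such a step with an update on its own task data.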


Multi-Organ Exchange

Journal of Artificial Intelligence Research

Kidney exchange, where candidates with organ failure trade incompatible but willing donors, is a life-saving alternative to the deceased donor waitlist, which has inadequate supply to meet demand. While fielded kidney exchanges see huge benefit from altruistic kidney donors (who give an organ without a paired needy candidate), a significantly higher medical risk to the donor deters similar altruism with livers. In this paper, we begin by exploring the idea of large-scale liver exchange, and show on demographically accurate data that vetted kidney exchange algorithms can be adapted to clear such an exchange at the nationwide level. We then propose cross-organ donation where kidneys and livers can be bartered for each other. We show theoretically that this multi-organ exchange provides linearly more transplants than running separate kidney and liver exchanges. This linear gain is a product of altruistic kidney donors creating chains that thread through the liver pool; it exists even when only a small but constant portion of the donors on the kidney side of the pool are willing to donate a liver lobe. We support this result experimentally on demographically accurate multi-organ exchanges. We conclude with thoughts regarding the fielding of a nationwide liver or joint liver-kidney exchange from a legal and computational point of view.