AITopics | Agents

Agents

Learning to Assist Agents by Observing Them

Keurulainen, Antti, Westerlund, Isak, Kaski, Samuel, Ilin, Alexander

arXiv.org Artificial Intelligence, Oct-4-2021

The ability of an AI agent to assist other agents, such as humans, is an important and challenging goal, which requires the assisting agent to reason about the behavior and infer the goals of the assisted agent. Training such an ability by using reinforcement learning usually requires large amounts of online training, which is difficult and costly. On the other hand, offline data about the behavior of the assisted agent might be available, but is non-trivial to take advantage of by methods such as offline reinforcement learning. We introduce methods where the capability to create a representation of the behavior is first pre-trained with offline data, after which only a small amount of interaction data is needed to learn an assisting policy. We test the setting in a gridworld where the helper agent has the capability to manipulate the environment of the assisted artificial agents, and introduce three different scenarios where the assistance considerably improves the performance of the assisted agents.

agent, goal-driven agent, helper agent, (14 more...)

2110.01311

Country: Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (0.82)

Industry:

Education > Educational Setting > Online (0.54)
Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values

Heuillet, Alexandre, Couthouis, Fabien, Díaz-Rodríguez, Natalia

arXiv.org Artificial Intelligence, Oct-4-2021

While Explainable Artificial Intelligence (XAI) is expanding into more and more areas of application, little has been done to make deep Reinforcement Learning (RL) more comprehensible. As RL becomes ubiquitous and is used in critical and general public applications, it is essential to develop methods that make it better understood and more interpretable. This study proposes a novel approach to explain cooperative strategies in multiagent RL using Shapley values, a game theory concept used in XAI that successfully explains the rationale behind decisions taken by Machine Learning algorithms. Through testing common assumptions of this technique in two cooperation-centered, socially challenging multi-agent environments, this article argues that Shapley values are a pertinent way to evaluate the contribution of players in a cooperative multi-agent RL context. To mitigate the high overhead of this method, Shapley values are approximated using Monte Carlo sampling. Experimental results on Multiagent Particle and Sequential Social Dilemmas show that Shapley values succeed at estimating the contribution of each agent. These results could have implications that go beyond games, in economics (e.g., for non-discriminatory decision making, ethical and responsible AI-derived decisions, or policy making under fairness constraints). They also expose how Shapley values only give general explanations about a model and can neither explain a single run or episode nor justify precise actions taken by agents. Future work should focus on addressing these critical aspects.
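The Monte Carlo approximation mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard permutation-sampling estimator for Shapley values, which may differ from the authors' exact procedure. The function names and the toy team reward `toy_team_value` are hypothetical; in an RL setting the value of a coalition would come from evaluating episodes with only those agents acting normally.

```python
import random
from typing import Callable, Dict, FrozenSet, List


def monte_carlo_shapley(
    agents: List[str],
    team_value: Callable[[FrozenSet[str]], float],
    num_samples: int = 2000,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate each agent's Shapley value by sampling random agent orderings."""
    rng = random.Random(seed)
    totals = {agent: 0.0 for agent in agents}
    for _ in range(num_samples):
        order = agents[:]
        rng.shuffle(order)
        coalition: FrozenSet[str] = frozenset()
        previous = team_value(coalition)
        for agent in order:
            # Marginal contribution of this agent when joining the coalition.
            coalition = coalition | {agent}
            current = team_value(coalition)
            totals[agent] += current - previous
            previous = current
    return {agent: total / num_samples for agent, total in totals.items()}


def toy_team_value(coalition: FrozenSet[str]) -> float:
    """Hypothetical team reward: additive skills plus a bonus when a and b cooperate."""
    skill = {"a": 1.0, "b": 2.0, "c": 0.0}
    bonus = 3.0 if {"a", "b"} <= coalition else 0.0
    return sum(skill[name] for name in coalition) + bonus


if __name__ == "__main__":
    print(monte_carlo_shapley(["a", "b", "c"], toy_team_value))
```

For this toy reward the exact Shapley values are 2.5, 3.5 and 0 for agents a, b and c, so the estimates should land close to those numbers. Sampling orderings avoids enumerating all coalitions, whose number grows exponentially with the number of agents.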

agent, contribution, shapley value, (12 more...)

2110.01307

Country:

Europe > France (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

An Unsupervised Video Game Playstyle Metric via State Discretization

Lin, Chiu-Chou, Chiu, Wei-Chen, Wu, I-Chen

arXiv.org Artificial Intelligence, Oct-3-2021

When playing video games, different players usually have their own playstyles. Recently, video game AIs have improved greatly in playing strength. However, past research on analyzing player behavior still relied on heuristic rules or on behavior features that need game-environment support, which makes it exhausting for developers to define features that discriminate between playstyles. In this paper, we propose the first metric for video game playstyles computed directly from game observations and actions, without any prior specification of the playstyle in the target game. Our proposed method is built upon a novel scheme of learning discrete representations that maps game observations into latent discrete states, such that playstyles can be exhibited from these discrete states. Namely, we measure the playstyle distance based on game observations aligned to the same states. We demonstrate high playstyle accuracy of our metric in experiments on several video game platforms, including TORCS, RGSK, and seven Atari games, and for different agents including rule-based AI bots, learning-based AI bots, and human players.

experiment, playstyle, playstyle distance, (15 more...)

2110.0095

Country: Asia > Taiwan (0.05)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Games (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
(2 more...)

DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization

Li, Boyue, Li, Zhize, Chi, Yuejie

arXiv.org Machine Learning, Oct-3-2021

Emerging applications in multi-agent environments such as internet-of-things, networked sensing, autonomous systems and federated learning, call for decentralized algorithms for finite-sum optimizations that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS) for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle (IFO) complexity of centralized algorithms for finding first-order stationary points, while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that the resource efficiencies of DESTRESS improve upon prior decentralized algorithms over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas including stochastic recursive gradient updates with mini-batches for local computation, gradient tracking with extra mixing (i.e., multiple gossiping rounds) for per-iteration communication, together with careful choices of hyper-parameters and new analysis frameworks to provably achieve a desirable computation-communication trade-off.
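One ingredient named in this abstract, gradient tracking with extra mixing, can be sketched on a toy problem. This is not DESTRESS itself: the sketch uses exact local gradients of scalar quadratics instead of stochastic recursive mini-batch gradients, and the function names, ring topology, step size and mixing weights are all assumptions chosen for illustration.

```python
from typing import List


def mix(weights: List[List[float]], values: List[float]) -> List[float]:
    """One gossip round: each agent takes a weighted average over its neighbors."""
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


def gradient_tracking(
    targets: List[float],
    weights: List[List[float]],
    step_size: float = 0.1,
    iterations: int = 300,
    mixing_rounds: int = 2,
) -> List[float]:
    """Minimize sum_i 0.5 * (x - targets[i])**2 using neighbor-only communication."""
    n = len(targets)

    def local_grad(i: int, x: float) -> float:
        return x - targets[i]

    x = [0.0] * n
    g = [local_grad(i, x[i]) for i in range(n)]
    y = g[:]  # each agent's running estimate of the network-average gradient
    for _ in range(iterations):
        x_new = [xi - step_size * yi for xi, yi in zip(x, y)]
        for _ in range(mixing_rounds):  # extra mixing: several gossip rounds
            x_new = mix(weights, x_new)
        g_new = [local_grad(i, x_new[i]) for i in range(n)]
        y_new = [yi + gn - go for yi, gn, go in zip(y, g_new, g)]
        for _ in range(mixing_rounds):
            y_new = mix(weights, y_new)
        x, g, y = x_new, g_new, y_new
    return x


if __name__ == "__main__":
    # Hypothetical ring of four agents with a doubly stochastic mixing matrix.
    ring = [
        [0.5, 0.25, 0.0, 0.25],
        [0.25, 0.5, 0.25, 0.0],
        [0.0, 0.25, 0.5, 0.25],
        [0.25, 0.0, 0.25, 0.5],
    ]
    print(gradient_tracking([1.0, 2.0, 3.0, 6.0], ring))
```

Every agent should end up near 3.0, the mean of the targets, which minimizes the summed loss. Raising `mixing_rounds` spends more communication per iteration in exchange for faster agreement between agents, which is the kind of computation-communication trade-off the abstract refers to.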

algorithm, complexity, optimization, (15 more...)

2110.01165

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Middle East > Saudi Arabia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.67)

Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

Bıyık, Erdem, Lalitha, Anusha, Saha, Rajarshi, Goldsmith, Andrea, Sadigh, Dorsa

arXiv.org Machine Learning, Oct-2-2021

When humans collaborate with each other, they often make decisions by observing others and considering the consequences that their actions may have on the entire team, instead of greedily doing what is best for just themselves. We would like our AI agents to effectively collaborate in a similar way by capturing a model of their partners. In this work, we propose and analyze a decentralized Multi-Armed Bandit (MAB) problem with coupled rewards as an abstraction of more general multi-agent collaboration. We demonstrate that naïve extensions of single-agent optimal MAB algorithms fail when applied to decentralized bandit teams. Instead, we propose a Partner-Aware strategy for joint sequential decision-making that extends the well-known single-agent Upper Confidence Bound algorithm. We analytically show that our proposed strategy achieves logarithmic regret, and provide extensive experiments involving human-AI and human-robot collaboration to validate our theoretical findings. Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.
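For readers unfamiliar with the single-agent baseline that this abstract builds on, here is a minimal sketch of the standard UCB1 rule on Bernoulli arms. It is not the partner-aware strategy from the paper, which is not specified in this listing. The function name `ucb1` and the arm means are illustrative assumptions.

```python
import math
import random
from typing import List


def ucb1(arm_means: List[float], horizon: int, seed: int = 0) -> List[int]:
    """Run UCB1 on Bernoulli arms and return how often each arm was pulled."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms
    reward_sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialize the estimates
        else:
            # Pick the arm with the highest empirical mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: reward_sums[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        reward_sums[arm] += reward
    return pulls


if __name__ == "__main__":
    print(ucb1([0.2, 0.5, 0.8], horizon=5000))
```

The exploration bonus shrinks as an arm is pulled more often, so suboptimal arms are tried only about logarithmically often in the horizon. In a team with coupled rewards, each agent's reward also depends on its partner's choice, which is the setting where the abstract reports that naïve extensions of this rule fail.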

agent, algorithm, partner-aware ucb, (16 more...)

2110.00751

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.75)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Pure Nash Equilibria in Resource Graph Games

Harks, Tobias (Augsburg University) | Klimm, Max | Matuschke, Jannik (KU Leuven)

Journal of Artificial Intelligence Research, Oct-1-2021

This paper studies the existence of pure Nash equilibria in resource graph games, a general class of strategic games succinctly representing the players’ private costs. These games are defined relative to a finite set of resources and the strategy set of each player corresponds to a set of subsets of resources. The cost of a resource is an arbitrary function of the load vector of a certain subset of resources. As our main result, we give complete characterizations of the cost functions guaranteeing the existence of pure Nash equilibria for weighted and unweighted players, respectively. For unweighted players, pure Nash equilibria are guaranteed to exist for any choice of the players’ strategy space if and only if the cost of each resource is an arbitrary function of the load of the resource itself and linear in the load of all other resources where the linear coefficients of mutual influence of different resources are symmetric. This implies in particular that for any other cost structure there is a resource graph game that does not have a pure Nash equilibrium. For weighted games where players have intrinsic weights and the cost of each resource depends on the aggregated weight of its users, pure Nash equilibria are guaranteed to exist if and only if the cost of a resource is linear in all resource loads, and the linear factors of mutual influence are symmetric, or there is no interaction among resources and the cost is an exponential function of the local resource load. We further discuss the computational complexity of pure Nash equilibria in resource graph games showing that for unweighted games where pure Nash equilibria are guaranteed to exist, it is coNP-complete to decide for a given strategy profile whether it is a pure Nash equilibrium. For general resource graph games, we prove that the decision whether a pure Nash equilibrium exists is $\Sigma_2^p$-complete.
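To make the objects in this abstract concrete, the sketch below checks by brute force whether a strategy profile is a pure Nash equilibrium in a tiny unweighted game. It assumes the cost form described for the unweighted case: each resource cost is an arbitrary function of its own load plus a term linear in the other loads. All names (`loads`, `player_cost`, `is_pure_nash`) and the example game are hypothetical, and the check enumerates explicitly listed strategies, so it says nothing about the complexity results for succinct representations.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence

Strategy = FrozenSet[str]


def loads(profile: Sequence[Strategy], resources: List[str]) -> Dict[str, int]:
    """Number of players using each resource under the given profile."""
    return {r: sum(1 for s in profile if r in s) for r in resources}


def player_cost(
    player: int,
    profile: Sequence[Strategy],
    resources: List[str],
    own_cost: Dict[str, Callable[[int], float]],
    influence: Dict[str, Dict[str, float]],
) -> float:
    """Sum over the player's resources r of f_r(load_r) + sum_{s != r} a[r][s] * load_s."""
    load = loads(profile, resources)
    total = 0.0
    for r in profile[player]:
        total += own_cost[r](load[r])
        total += sum(influence[r][s] * load[s] for s in resources if s != r)
    return total


def is_pure_nash(
    profile: Sequence[Strategy],
    strategy_sets: Sequence[Sequence[Strategy]],
    resources: List[str],
    own_cost: Dict[str, Callable[[int], float]],
    influence: Dict[str, Dict[str, float]],
) -> bool:
    """True if no player can lower its own cost by a unilateral deviation."""
    for player, options in enumerate(strategy_sets):
        current = player_cost(player, profile, resources, own_cost, influence)
        for alternative in options:
            deviated = list(profile)
            deviated[player] = alternative
            cost = player_cost(player, deviated, resources, own_cost, influence)
            if cost < current - 1e-12:
                return False
    return True


if __name__ == "__main__":
    resources = ["u", "v"]
    own_cost = {"u": lambda x: float(x * x), "v": lambda x: float(x * x)}
    influence = {"u": {"v": 0.5}, "v": {"u": 0.5}}  # symmetric coefficients
    options = [frozenset({"u"}), frozenset({"v"})]
    strategy_sets = [options, options]
    print(is_pure_nash([options[0], options[1]], strategy_sets, resources, own_cost, influence))
    print(is_pure_nash([options[0], options[0]], strategy_sets, resources, own_cost, influence))
```

In this example the profile where the two players split across the resources is an equilibrium, while the profile where both pick `u` is not, because either player can lower its cost from 4 to 1.5 by moving to `v`.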

graph game, pure nash equilibrium, resource graph game, (10 more...)

doi: 10.1613/jair.1.12668

AI Access Foundation

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.34)

Industry:

Leisure & Entertainment > Games (0.48)
Transportation (0.47)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Divergence-Regularized Multi-Agent Actor-Critic

Su, Kefan, Lu, Zongqing

arXiv.org Artificial Intelligence, Oct-1-2021

Entropy regularization is a popular method in reinforcement learning (RL). Although it has many advantages, it alters the RL objective and makes the converged policy deviate from the optimal policy of the original Markov Decision Process. Though divergence regularization has been proposed to settle this problem, it cannot be trivially applied to cooperative multi-agent reinforcement learning (MARL). In this paper, we investigate divergence regularization in cooperative MARL and propose a novel off-policy cooperative MARL framework, divergence-regularized multi-agent actor-critic (DMAC). Mathematically, we derive the update rule of DMAC which is naturally off-policy, guarantees a monotonic policy improvement and is not biased by the regularization. DMAC is a flexible framework and can be combined with many existing MARL algorithms. We evaluate DMAC in a didactic stochastic game and StarCraft Multi-Agent Challenge and empirically show that DMAC substantially improves the performance of existing MARL algorithms.
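The claim that entropy regularization shifts the converged policy, while divergence regularization need not, can be illustrated in a one-state toy problem with known action values. This is a generic textbook-style illustration under those assumptions, not the DMAC update rule; the function names and the values of `q_values` and `alpha` are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]


def entropy_regularized_policy(q_values: List[float], alpha: float) -> List[float]:
    """Maximizer of E_pi[Q] + alpha * H(pi): a softmax of Q / alpha."""
    return softmax([q / alpha for q in q_values])


def kl_regularized_step(
    q_values: List[float], reference: List[float], alpha: float
) -> List[float]:
    """Maximizer of E_pi[Q] - alpha * KL(pi || reference)."""
    return softmax([math.log(p) + q / alpha for p, q in zip(reference, q_values)])


def iterate_kl_regularized(q_values: List[float], alpha: float, steps: int) -> List[float]:
    """Repeat the KL-regularized step, using the previous policy as the reference."""
    policy = [1.0 / len(q_values)] * len(q_values)
    for _ in range(steps):
        policy = kl_regularized_step(q_values, policy, alpha)
    return policy


if __name__ == "__main__":
    q = [1.0, 0.5, 0.0]
    print(entropy_regularized_policy(q, alpha=1.0))
    print(iterate_kl_regularized(q, alpha=1.0, steps=50))
```

With entropy regularization the best action keeps only about half of the probability mass, so the policy stays away from the greedy optimum of the unregularized problem. When the penalty is instead a divergence to the previous policy, the penalty shrinks as the iterates stop changing, and in this toy case the policy concentrates on the best action.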

dmac, international conference, regularization, (14 more...)

2110.00304

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Gradient Update #9: Bias Bounties and Hierarchical Architectures for Computer Vision

#artificialintelligence, Sep-30-2021, 01:25:58 GMT

Welcome to the ninth update from the Gradient! If you were referred by a friend, subscribe and follow us on Twitter! This edition's news story is "Sharing learnings from the first algorithmic bias bounty challenge". Summary: Twitter's algorithmic bias bounty challenge, the first of its kind, recently concluded. While users had previously found the algorithm had a racial bias, the bounty uncovered a number of other biases and potential harms.

algorithm, bias bounty and hierarchical architecture, twitter, (8 more...)

Country:

North America > United States > Oregon (0.05)
North America > United States > Colorado > Boulder County > Boulder (0.05)
Asia > China (0.05)

Industry:

Information Technology > Security & Privacy (0.70)
Information Technology > Services (0.70)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.48)
(2 more...)

A Privacy-preserving Distributed Training Framework for Cooperative Multi-agent Deep Reinforcement Learning

Shi, Yimin

arXiv.org Artificial Intelligence, Sep-30-2021

Deep Reinforcement Learning (DRL) sometimes needs a large amount of data to converge during training, and in some cases each action of the agent may produce regret. This barrier naturally motivates different data set or environment owners to cooperate, sharing their knowledge and training their agents more efficiently. However, directly merging the raw data from different owners raises privacy concerns. To solve this problem, we propose a new Deep Neural Network (DNN) architecture with both a global NN and a local NN, and a distributed training framework. We allow the global weights to be updated by all the collaborator agents, while the local weights are only updated by the agent they belong to. In this way, we hope the global weights can share the common knowledge among these collaborators, while the local NN can keep the specialized properties and ensure the agent is compatible with its specific environment. Experiments show that the framework can efficiently help agents in the same or similar environments to collaborate in their training process and gain a higher convergence rate and better performance.
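The split between shared and private parameters described in this abstract can be sketched with a deliberately tiny stand-in. The assumptions are loud ones: a linear regression model replaces the paper's deep networks and RL objective, `global_w` plays the role of the global NN, `local_b[i]` plays the role of agent i's local NN, and all names and data are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]


def train_shared_and_local(
    datasets: List[List[Sample]],
    learning_rate: float = 0.05,
    epochs: int = 500,
) -> Tuple[float, List[float]]:
    """Fit y = global_w * x + local_b[i]; global_w is shared, local_b[i] belongs to agent i."""
    global_w = 0.0
    local_b = [0.0] * len(datasets)
    for _ in range(epochs):
        global_grads = []
        for i, data in enumerate(datasets):
            # Each agent computes gradients on its own private data only.
            errors = [(global_w * x + local_b[i] - y, x) for x, y in data]
            grad_w = sum(e * x for e, x in errors) / len(data)
            grad_b = sum(e for e, _ in errors) / len(data)
            local_b[i] -= learning_rate * grad_b  # updated only by its owner
            global_grads.append(grad_w)  # only this gradient leaves the agent
        # Every collaborator contributes to the shared weight.
        global_w -= learning_rate * sum(global_grads) / len(global_grads)
    return global_w, local_b


if __name__ == "__main__":
    xs = [-1.0, 0.0, 1.0, 2.0]
    agent_0 = [(x, 2.0 * x + 1.0) for x in xs]  # shared slope 2, private offset 1
    agent_1 = [(x, 2.0 * x - 3.0) for x in xs]  # shared slope 2, private offset -3
    print(train_shared_and_local([agent_0, agent_1]))
```

The shared weight should settle near the common slope 2, while each private offset settles near its own agent's value. The raw samples never leave the agent that owns them in this sketch. Only per-agent gradients of the shared weight are exchanged, and whether gradient sharing alone meets a given privacy requirement is outside what this toy shows.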

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2109.14998

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Overview (0.46)
Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Implementation of Parallel Simplified Swarm Optimization in CUDA

Yeh, Wei-Chang, Liu, Zhenyao, Tan, Shi-Yi, Huang, Shang-Ke

arXiv.org Artificial Intelligence, Sep-30-2021

As the acquisition cost of graphics processing units (GPUs) has decreased, personal computers (PCs) can now handle optimization problems. In optimization computing, swarm intelligence algorithms (SIAs) are well suited to parallelization. However, a GPU-based Simplified Swarm Optimization algorithm has never been proposed. Accordingly, this paper proposes Parallel Simplified Swarm Optimization (PSSO) based on the CUDA platform, considering computational ability and versatility. In PSSO, the theoretical time complexity of the fitness function is O(tNm): there are t iterations and N fitness functions, each of which requires m pair comparisons. In previous studies, updating the pBests and gBest suffered from resource preemption. As the experimental results show, the time complexity was successfully reduced by an order of magnitude of N, and the problem of resource preemption was avoided entirely.

algorithm, kernel function, psso, (14 more...)

2110.0147

Country:

North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Asia > Taiwan (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
