AITopics | Agents


The Role of Bio-Inspired Modularity in General Learning

StClair, Rachel A., Hahn, William Edward, Barenholtz, Elan

arXiv.org Artificial Intelligence, Sep-23-2021

One goal of general intelligence is to learn novel information without overwriting prior learning. The utility of learning without catastrophic forgetting (CF) is twofold: first, the system can return to previously learned tasks after learning something new. In addition, bootstrapping previous knowledge may allow for faster learning of a novel task. Previous approaches to CF and bootstrapping are primarily based on modifying learning in the form of changing weights to tune the model to the current task, overwriting previously tuned weights from previous tasks. However, another critical factor that has been largely overlooked is the initial network topology, or architecture. Here, we argue that the topology of biological brains likely evolved certain features that are designed to achieve this kind of informational conservation. In particular, we consider that the highly conserved property of modularity may offer a solution to weight-update learning methods that adheres to the constraints of learning without catastrophic forgetting and of bootstrapping. Final considerations are then made on how to combine these two learning objectives in a dynamical, general learning system.

architecture, learning, modularity, (11 more...)

arXiv.org Artificial Intelligence

2109.15097

Country:

North America > United States > New York (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.46)

Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning

Alfano, Carlo, Rebeschini, Patrick

arXiv.org Machine Learning, Sep-23-2021

Cooperative multi-agent reinforcement learning is a decentralized paradigm in sequential decision making where agents distributed over a network iteratively collaborate with neighbors to maximize global (network-wide) notions of rewards. Exact computations typically involve a complexity that scales exponentially with the number of agents. To address this curse of dimensionality, we design a scalable algorithm based on the Natural Policy Gradient framework that uses local information and only requires agents to communicate with neighbors within a certain range. Under standard assumptions on the spatial decay of correlations for the transition dynamics of the underlying Markov process and the localized learning policy, we show that our algorithm converges to the globally optimal policy with a dimension-free statistical and computational complexity, incurring a localization error that does not depend on the number of agents and converges to zero exponentially fast as a function of the range of communication.
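The communication constraint described in this abstract (agents exchange information only with neighbors within a certain range) can be illustrated with a small sketch. This is not the authors' algorithm: the function name `neighbors_within_range`, the example network, and the reading of "range" as a hop count on a graph are all assumptions made purely for illustration.

```python
from collections import deque


def neighbors_within_range(adjacency, agent, k):
    """Return the agents at most k hops from `agent`, excluding the agent itself.

    `adjacency` maps each agent to a list of its direct neighbors.
    Hop count as the notion of "range" is an assumption of this sketch.
    """
    hops = {agent: 0}          # breadth-first distance from the starting agent
    queue = deque([agent])
    while queue:
        node = queue.popleft()
        if hops[node] == k:    # do not expand beyond the communication range
            continue
        for nxt in adjacency[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return {n for n in hops if n != agent}


# Hypothetical line network of five agents: 0 - 1 - 2 - 3 - 4
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

In such a sketch, the size of each agent's neighborhood depends on the range `k` and the local connectivity rather than on the total number of agents, which is the intuition behind a localized, dimension-free scheme.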

agent, assumption, complexity, (17 more...)

arXiv.org Machine Learning

2109.11692

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning

Kuba, Jakub Grudzien, Chen, Ruiqing, Wen, Munning, Wen, Ying, Sun, Fanglei, Wang, Jun, Yang, Yaodong

arXiv.org Artificial Intelligence, Sep-23-2021

Trust region methods rigorously enabled reinforcement learning (RL) agents to learn monotonically improving policies, leading to superior performance on a variety of tasks. Unfortunately, when it comes to multi-agent reinforcement learning (MARL), the property of monotonic improvement may not simply apply; this is because agents, even in cooperative games, could have conflicting directions of policy updates. As a result, achieving a guaranteed improvement on the joint policy where each agent acts individually remains an open challenge. In this paper, we extend the theory of trust region learning to MARL. Central to our findings are the multi-agent advantage decomposition lemma and the sequential policy update scheme. Based on these, we develop Heterogeneous-Agent Trust Region Policy Optimisation (HATRPO) and Heterogeneous-Agent Proximal Policy Optimisation (HAPPO) algorithms. Unlike many existing MARL algorithms, HATRPO/HAPPO do not need agents to share parameters, nor do they need any restrictive assumptions on decomposability of the joint value function. Most importantly, we justify in theory the monotonic improvement property of HATRPO/HAPPO. We evaluate the proposed methods on a series of Multi-Agent MuJoCo and StarCraftII tasks. Results show that HATRPO and HAPPO significantly outperform strong baselines such as IPPO, MAPPO and MADDPG on all tested tasks, therefore establishing a new state of the art.

agent, algorithm 1, joint policy, (14 more...)

arXiv.org Artificial Intelligence

2109.11251

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.54)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Individual and Collective Autonomous Development

Lippi, Marco, Mariani, Stefano, Martinelli, Matteo, Zambonelli, Franco

arXiv.org Artificial Intelligence, Sep-23-2021

The increasing complexity and unpredictability of many ICT scenarios let us envision that future systems will have to dynamically learn how to act and adapt to face evolving situations with little or no a priori knowledge, both at the level of individual components and at the collective level. In other words, such systems should become able to autonomously develop models of themselves and of their environment. Autonomous development includes: learning models of own capabilities; learning how to act purposefully towards the achievement of specific goals; and learning how to act collectively, i.e., accounting for the presence of others. In this paper, we introduce the vision of autonomous development in ICT systems, by framing its key concepts and by illustrating suitable application domains. Then, we overview the many research areas that are contributing or can potentially contribute to the realization of the vision, and identify some key research challenges.

agent, autonomous development, learning, (15 more...)

arXiv.org Artificial Intelligence

2109.11223

Country: North America > United States > Hawaii (0.04)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Leisure & Entertainment > Games (0.94)
Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

BCS Lovelace Lecture 2021

Oxford Comp Sci, Sep-22-2021, 09:10:02 GMT

In this talk I will review the development of commercially successful Knowledge Representation and Reasoning (KRR) systems and their genesis in foundational research. I will trace the evolution of KRR systems from logical and algorithmic foundations, through academic prototypes and standardisation to robust and scalable systems that power applications in areas as diverse as search, healthcare, financial services and manufacturing. I will discuss the barriers and milestones encountered along the journey, and lessons learned about the exploitation of research. Multi-agent systems first emerged as a research topic in the late 1980s. A key driver behind the emergence of the field was the idea of building systems that actively worked on behalf of human users in the pursuit of those users' goals.

bc lovelace lecture 2021, computer science, university, (11 more...)

Oxford Comp Sci

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.19)
Europe > United Kingdom > England > Leicestershire > Loughborough (0.05)
Europe > Norway > Eastern Norway > Oslo (0.05)

Genre: Personal (0.53)

Industry: Government (0.75)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Amid Skepticism, Biden Vows a New Era of Global Collaboration

The New Yorker, Sep-22-2021, 00:37:21 GMT

Joe Biden made his début at the elegant green-marble rostrum of the United Nations this week, as the coronavirus infected more than half a million people each day worldwide, as wildfires and floods aggravated by climate change ravaged the Earth, and as the U.S. struggled to prevent a new cold war with China. In lofty language, the President tried to redirect the world's focus away from the calamitous end to America's longest war, in Afghanistan, and a recent bust-up with its most longstanding ally, France. Just eight months into his Presidency, Biden is already trying to hit reset on his foreign policy. "I stand here today for the first time in twenty years with the United States not at war. We've turned the page," Biden told the chamber.

biden, general assembly, washington, (14 more...)

The New Yorker

Country:

Europe > France (0.71)
Asia > Afghanistan (0.26)
Oceania > Australia (0.07)
(5 more...)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
Government > Foreign Policy (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.40)

Making Human-Like Trade-offs in Constrained Environments by Learning from Demonstrations

Glazier, Arie, Loreggia, Andrea, Mattei, Nicholas, Rahgooy, Taher, Rossi, Francesca, Venable, K. Brent

arXiv.org Artificial Intelligence, Sep-22-2021

Many real-life scenarios require humans to make difficult trade-offs: do we always follow all the traffic rules or do we violate the speed limit in an emergency? These scenarios force us to evaluate the trade-off between collective norms and our own personal objectives. To create effective AI-human teams, we must equip AI agents with a model of how humans make trade-offs in complex, constrained environments. These agents will be able to mirror human behavior or to draw human attention to situations where decision making could be improved. To this end, we propose a novel inverse reinforcement learning (IRL) method for learning implicit hard and soft constraints from demonstrations, enabling agents to quickly adapt to new settings. In addition, learning soft constraints over states, actions, and state features allows agents to transfer this knowledge to new domains that share similar aspects. We then use the constraint learning method to implement a novel system architecture that leverages a cognitive model of human decision making, multi-alternative decision field theory (MDFT), to orchestrate competing objectives. We evaluate the resulting agent on trajectory length, number of violated constraints, and total reward, demonstrating that our agent architecture is both general and achieves strong performance. Thus we are able to capture and replicate human-like trade-offs from demonstrations in environments when constraints are not explicit.

constraint, demonstration, mesc-irl, (15 more...)

arXiv.org Artificial Intelligence

2109.11018

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Florida > Escambia County > Pensacola (0.04)
Asia > Singapore (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)

Towards Multi-Agent Reinforcement Learning using Quantum Boltzmann Machines

Müller, Tobias, Roch, Christoph, Schmid, Kyrill, Altmann, Philipp

arXiv.org Artificial Intelligence, Sep-22-2021

Reinforcement learning has driven impressive advances in machine learning. Simultaneously, quantum-enhanced machine learning algorithms using quantum annealing are under heavy development. Recently, a multi-agent reinforcement learning (MARL) architecture combining both paradigms has been proposed. This novel algorithm, which utilizes Quantum Boltzmann Machines (QBMs) for Q-value approximation, has outperformed regular deep reinforcement learning in terms of time-steps needed to converge. However, this algorithm was restricted to single-agent and small 2x2 multi-agent grid domains. In this work, we propose an extension to the original concept in order to solve more challenging problems. Similar to classic DQNs, we add an experience replay buffer and use different networks for approximating the target and policy values. The experimental results show that learning becomes more stable and enables agents to find optimal policies in grid-domains with higher complexity. Additionally, we assess how parameter sharing influences the agents' behavior in multi-agent domains. Quantum sampling proves to be a promising method for reinforcement learning tasks, but is currently limited by the QPU size and therefore by the size of the input and Boltzmann machine.
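The two DQN-style ingredients named in this abstract, an experience replay buffer and separate approximators for target and policy values, can be sketched in a few lines. This is a classical, tabular stand-in and not the paper's method: plain dictionaries replace the Quantum Boltzmann Machines, and every name here (`ReplayBuffer`, `td_target`, `train_step`, `sync_target`) as well as the learning rate and discount values are assumptions for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.storage), min(batch_size, len(self.storage)))

    def __len__(self):
        return len(self.storage)


def td_target(target_q, reward, next_state, done, gamma=0.9):
    """Bootstrapped target computed from the frozen target table only."""
    if done:
        return reward
    return reward + gamma * max(target_q[next_state].values())


def train_step(policy_q, target_q, batch, lr=0.5, gamma=0.9):
    """Move policy values toward targets; the target table is left untouched."""
    for state, action, reward, next_state, done in batch:
        y = td_target(target_q, reward, next_state, done, gamma)
        policy_q[state][action] += lr * (y - policy_q[state][action])


def sync_target(policy_q, target_q):
    """Periodically copy the policy values into the target table."""
    for state, values in policy_q.items():
        target_q[state] = dict(values)
```

Keeping the target table fixed between calls to `sync_target` means the regression targets do not shift on every update, which is the usual motivation for this split in DQN-style training.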

agent, architecture, reinforcement, (14 more...)

arXiv.org Artificial Intelligence

2109.109

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > Promising Solution (0.68)

Industry:

Health & Medicine (0.68)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

A Socially Aware Reinforcement Learning Agent for The Single Track Road Problem

Shapira, Ido, Azaria, Amos

arXiv.org Artificial Intelligence, Sep-22-2021

We present the single track road problem. In this problem two agents face each other at opposite positions of a road that can only have one agent pass at a time. We focus on the scenario in which one agent is human, while the other is an autonomous agent. We run experiments with human subjects in a simple grid domain, which simulates the single track road problem. We show that when data is limited, building an accurate human model is very challenging, and that a reinforcement learning agent, which is based on this data, does not perform well in practice. However, we show that an agent that tries to maximize a linear combination of the human's utility and its own utility achieves a high score, and significantly outperforms other baselines, including an agent that tries to maximize only its own utility. While humans can cope with new situations quite easily, even state-of-the-art algorithms struggle with new situations that they haven't been trained on. Unfortunately, when it comes to autonomous vehicles the results may be devastating. One example of an uncommon, yet important scenario for autonomous vehicles is the problem of a single track road. In this problem two vehicles in opposite directions must cross a narrow road, which is not wide enough to allow both vehicles to pass at the same time.
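The idea of an agent that maximizes a linear combination of the human's utility and its own can be sketched as a one-step action choice. This is a minimal illustration, not the paper's agent: the function name `socially_aware_action`, the convex weighting by a single `weight` parameter, the action names, and the utility numbers are all hypothetical.

```python
def socially_aware_action(actions, own_utility, human_utility, weight=0.5):
    """Pick the action with the highest weighted sum of both utilities.

    `weight` is the share given to the human's utility (an assumption of this
    sketch); weight=0.0 recovers an agent that maximizes only its own utility.
    """
    def combined(action):
        return (1 - weight) * own_utility[action] + weight * human_utility[action]

    return max(actions, key=combined)


# Hypothetical utilities for a single decision on a single track road.
actions = ["advance", "yield"]
own_utility = {"advance": 1.0, "yield": 0.4}
human_utility = {"advance": -1.0, "yield": 0.8}

selfish_choice = socially_aware_action(actions, own_utility, human_utility, weight=0.0)
social_choice = socially_aware_action(actions, own_utility, human_utility, weight=0.5)
```

With these made-up numbers the purely self-interested setting advances, while an equal weighting yields, because the human's loss from a blocked road outweighs the agent's own gain.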

agent, single track road problem, vehicle, (14 more...)

arXiv.org Artificial Intelligence

2109.05486

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment > Games (0.94)
Transportation > Ground > Road (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning

Zohar, Roy, Mannor, Shie, Tennenholtz, Guy

arXiv.org Machine Learning, Sep-22-2021

Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents. As environments grow in size, effective credit assignment becomes increasingly difficult and often results in infeasible learning times. Still, in many real-world settings, there exist simplified underlying dynamics that can be leveraged for more scalable solutions. In this work, we exploit such locality structures effectively whilst maintaining global cooperation. We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm. Additionally, we provide a direct reward decomposition method for finding these local rewards when only a global signal is provided. We test our method empirically, showing it scales well compared to other methods, significantly improving performance and convergence speed.

assumption 2, decomposition, reward decomposition, (16 more...)

arXiv.org Machine Learning

2109.10632

Country:

North America > United States (0.06)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
