AITopics

doi: 10.1109/AITEST49225.2020.00011

2103.04364

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts (0.04)
Europe > United Kingdom > England > Leicestershire > Loughborough (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Tian, Ran, Tomizuka, Masayoshi, Sun, Liting

Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data

arXiv.org Artificial Intelligence Mar-7-2021

A reward function, as an incentive representation that recognizes humans' agency and rationalizes humans' actions, is particularly appealing for modeling human behavior in human-robot interaction. Inverse Reinforcement Learning is an effective way to retrieve reward functions from demonstrations. However, applying it to multi-agent settings has always been challenging, since the mutual influence between agents has to be appropriately modeled. To tackle this challenge, previous work either exploits equilibrium solution concepts by assuming humans to be perfectly rational optimizers with unbounded intelligence, or pre-assigns humans' interaction strategies a priori. In this work, we advocate that humans are boundedly rational and have different intelligence levels when reasoning about others' decision-making processes, and that such an inherent and latent characteristic should be accounted for in reward learning algorithms. Hence, we exploit such insights from Theory-of-Mind and propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning. We validate our approach in both zero-sum and general-sum games with synthetic agents and illustrate a practical application to learning human drivers' reward functions from real driving data. We compare our approach with two baseline algorithms. The results show that, by reasoning about humans' latent intelligence levels, the proposed approach is more flexible and retrieves reward functions that better explain humans' driving behaviors.
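As a rough illustration of the level-k idea behind the Theory-of-Mind framing above, the sketch below computes the policy of an agent that responds (softly) to an opponent assumed to reason one level lower. It is not the paper's algorithm: the function names, the uniform level-0 policy, the rationality parameter `beta`, and the yield/go payoff matrix are all assumptions made for illustration, and inferring the latent level from data is not shown.

```python
import math

def softmax(values, beta):
    """Softly rational choice: higher beta means closer to a strict best response."""
    top = max(values)
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def level_k_policy(payoff_self, payoff_other, k, beta=5.0):
    """Action distribution of a level-k agent in a two-player matrix game.

    payoff_self[a][b]: own payoff for own action a and opponent action b.
    payoff_other[b][a]: opponent payoff for opponent action b and own action a.
    """
    n_actions = len(payoff_self)
    if k == 0:
        # Assumed level-0 behavior: non-strategic, uniform over actions.
        return [1.0 / n_actions] * n_actions
    # The opponent is modeled as a level-(k-1) agent seen from its own side.
    opponent = level_k_policy(payoff_other, payoff_self, k - 1, beta)
    expected = [
        sum(p * payoff_self[a][b] for b, p in enumerate(opponent))
        for a in range(n_actions)
    ]
    return softmax(expected, beta)

# Hypothetical symmetric yield/go game: action 0 = yield, action 1 = go.
# Going while the other yields pays 2, both going costs 10, yielding pays 0.
GAME = [[0.0, 0.0], [2.0, -10.0]]

level_1 = level_k_policy(GAME, GAME, 1)
level_2 = level_k_policy(GAME, GAME, 2)
```

Under these assumed payoffs, a level-1 agent mostly yields, while a level-2 agent, expecting the other to yield, mostly goes. The same reward therefore produces different behavior at different levels, which is why the level matters when rewards are learned from demonstrations.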

agent, intelligence level, reward function, (15 more...)

2103.04289

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games > Computer Games (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Awad, Abubakr, Pang, Wei, Lusseau, David, Coghill, George M.

A Survey on Physarum Polycephalum Intelligent Foraging Behaviour and Bio-Inspired Applications

arXiv.org Artificial Intelligence Mar-7-2021

Bio-inspired computing focuses on extracting computational models for problem solving from an in-depth understanding of the behaviour and mechanisms of biological systems. In recent years, cellular computational models based on the structure and processes of living cells, such as bacterial colonies [43] and viral models [23], have become an important line of research in bio-inspired computing. Physarum-computing, as an example of a cellular computing model, has attracted the attention of many researchers [84]. Physarum polycephalum (Physarum for short) is an example of plasmodial slime moulds that are classified as a fungus "Myxomycetes" [21]. Research on Physarum-inspired computing has become more popular since Nakagaki et al. (2000) performed their well-known experiments showing that Physarum was able to find the shortest route through a maze [57]. Recent research has confirmed the ability of Physarum-inspired algorithms to solve a wide range of problems [103, 78]. Physarum can be modelled as a reaction-diffusion system (cytoplasmic liquid) encapsulated in an elastic growing membrane of actin-myosin cytoskeleton [2].
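To make the shortest-route behaviour more concrete, here is a minimal sketch of a flux-based tube adaptation rule of the kind commonly used in Physarum-inspired solvers. It is restricted to two parallel tubes joining a food source and a sink. This is a swapped-in illustration, not the reaction-diffusion model mentioned above, and the function name, tube lengths, time step, and initial conductivities are all assumptions.

```python
def adapt_two_tubes(len_short=1.0, len_long=2.0, total_flux=1.0,
                    dt=0.1, steps=500):
    """Evolve the conductivities of two parallel tubes under a fixed total flux."""
    d_short, d_long = 1.0, 1.0  # assumed equal starting conductivities
    for _ in range(steps):
        # Conductance of each tube is conductivity divided by length.
        g_short = d_short / len_short
        g_long = d_long / len_long
        # Parallel tubes share the same pressure drop, so the flux
        # splits in proportion to conductance.
        q_short = total_flux * g_short / (g_short + g_long)
        q_long = total_flux * g_long / (g_short + g_long)
        # Tubes carrying more flux thicken; unused tubes decay.
        d_short += dt * (abs(q_short) - d_short)
        d_long += dt * (abs(q_long) - d_long)
    return d_short, d_long

d_short, d_long = adapt_two_tubes()
```

With these assumed settings the shorter tube ends up carrying essentially all of the flux while the longer one withers away, a two-route analogue of the maze result.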

algorithm, physarum, physarum polycephalum, (15 more...)

2103.00172

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Europe > Poland > Subcarpathia Province > Rzeszów (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Rail (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

arXiv.org Machine Learning Mar-6-2021

Linear Regression over Networks with Communication Guarantees

Gatsis, Konstantinos

A key functionality of emerging connected autonomous systems, such as smart cities, smart transportation systems, and the industrial Internet-of-Things, is the ability to process and learn from data collected at different physical locations. This is increasingly attracting attention under the terms distributed learning and federated learning. However, in connected autonomous systems, data transfer takes place over communication networks with often limited resources. This paper examines communication-efficient learning algorithms for linear regression tasks that exploit the informativeness of the data. The developed algorithms enable a tradeoff between communication and learning with theoretical performance guarantees and efficient practical implementations.
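One way to picture the communication/learning tradeoff is an event-triggered scheme in which a sample is transmitted only when the current model predicts it poorly. The sketch below is an illustrative assumption, not the paper's algorithm: the squared-error trigger, the least-mean-squares update, the scalar model, and all names and constants are hypothetical, and the sender is assumed to know the current model.

```python
import random

def event_triggered_regression(samples, threshold, step=0.1):
    """Fit y ~ w * x, using only samples whose squared error exceeds threshold."""
    w = 0.0
    sent = 0
    for x, y in samples:
        error = y - w * x
        if error * error > threshold:
            # The sample is judged informative: transmit it and update.
            w += step * error * x
            sent += 1
    return w, sent

# Synthetic data from an assumed ground-truth slope with small noise.
rng = random.Random(0)
true_w = 2.0
samples = []
for _ in range(500):
    x = rng.uniform(-1.0, 1.0)
    samples.append((x, true_w * x + rng.gauss(0.0, 0.05)))

w_all, sent_all = event_triggered_regression(samples, threshold=0.0)
w_few, sent_few = event_triggered_regression(samples, threshold=0.01)
```

Raising `threshold` reduces the number of transmissions at the cost of a coarser estimate, which is the qualitative tradeoff the abstract describes.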

agent, learning, performance gain, (14 more...)

arXiv.org Machine Learning

2103.0414

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Smart Houses & Appliances (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.61)

Off-Belief Learning

Hu, Hengyuan, Lerer, Adam, Cui, Brandon, Pineda, Luis, Wu, David, Brown, Noam, Foerster, Jakob

The standard problem setting in Dec-POMDPs is self-play, where the goal is to find a set of policies that play optimally together. Policies learned through self-play may adopt arbitrary conventions and rely on multi-step counterfactual reasoning based on assumptions about other agents' actions, and thus fail when paired with humans or independently trained agents. In contrast, no current methods can learn optimal policies that are fully grounded, i.e., do not rely on counterfactual information from observing other agents' actions. To address this, we present off-belief learning (OBL): at each time step OBL agents assume that all past actions were taken by a given, fixed policy ($\pi_0$), but that future actions will be taken by an optimal policy under these same assumptions. When $\pi_0$ is uniform random, OBL learns the optimal grounded policy. OBL can be iterated in a hierarchy, where the optimal policy from one level becomes the input to the next. This introduces counterfactual reasoning in a controlled manner. Unlike independent RL, which may converge to any equilibrium policy, OBL converges to a unique policy, making it more suitable for zero-shot coordination. OBL can be scaled to high-dimensional settings with a fictitious transition mechanism and shows strong performance in both a simple toy setting and the benchmark human-AI/zero-shot coordination problem Hanabi.

agent, obl, reasoning, (16 more...)

2103.04

Country:

North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Causal Analysis of Agent Behavior for AI Safety

Déletang, Grégoire, Grau-Moya, Jordi, Martic, Miljan, Genewein, Tim, McGrath, Tom, Mikulik, Vladimir, Kunesch, Markus, Legg, Shane, Ortega, Pedro A.

As machine learning systems become more powerful, they also become increasingly unpredictable and opaque. Yet, finding human-understandable explanations of how they work is essential for their safe deployment. This technical report illustrates a methodology for investigating the causal mechanisms that drive the behaviour of artificial agents. Six use cases are covered, each addressing a typical question an analyst might ask about an agent. In particular, we show that none of these questions can be addressed by observation alone; each instead requires conducting experiments with systematically chosen manipulations so as to generate the correct causal evidence.
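The gap between observing and intervening can be seen in a toy setting that is not one of the report's six use cases. Everything in the sketch below is an assumption for illustration: a hidden variable `z` drives both a visible cue `x` and the agent's action, so the cue looks predictive under observation even though the agent ignores it.

```python
def agent_action(z, x):
    """Hypothetical agent whose action depends only on z; the cue x is ignored."""
    return z

def observational(x_value):
    """P(action = 1 | x = x_value) when x simply copies z in the environment."""
    # Two equally likely worlds, listed as (z, x) pairs with x == z.
    worlds = [(z, z) for z in (0, 1)]
    actions = [agent_action(z, x) for z, x in worlds if x == x_value]
    return sum(actions) / len(actions)

def interventional(x_value):
    """P(action = 1 | do(x = x_value)): the analyst sets x, z stays random."""
    actions = [agent_action(z, x_value) for z in (0, 1)]
    return sum(actions) / len(actions)

p_observed = observational(1)     # cue and action move together
p_intervened = interventional(1)  # setting the cue by hand changes nothing
```

Here observation alone would suggest the agent reacts to the cue, while the manipulation shows that the action is independent of it.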

agent, intervention, probability, (14 more...)

2103.03938

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Research Report > Strength High (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Dulian, Albert, Murray, John C.

Exploiting latent representation of sparse semantic layers for improved short-term motion prediction with Capsule Networks

As urban environments manifest high levels of complexity, it is of vital importance that safety systems embedded within autonomous vehicles (AVs) are able to accurately anticipate the short-term future motion of nearby agents. This problem can be further understood as generating a sequence of coordinates describing the future motion of the tracked agent. Various proposed approaches demonstrate significant benefits of using a rasterised top-down image of the road, in combination with Convolutional Neural Networks (CNNs), for extraction of relevant features that define the road structure (e.g. driveable areas, lanes, walkways). In contrast, this paper explores the use of Capsule Networks (CapsNets) in the context of learning a hierarchical representation of sparse semantic layers corresponding to small regions of the High-Definition (HD) map. Each region of the map is dismantled into separate geometrical layers that are extracted with respect to the agent's current position. By using an architecture based on CapsNets, the model is able to retain hierarchical relationships between detected features within images whilst also preventing the loss of spatial data often caused by the pooling operation. We train and evaluate our model on the publicly available nuTonomy scenes dataset and compare it to recently published methods. We show that our model achieves significant improvement over recently published works on deterministic prediction, whilst drastically reducing the overall size of the network.

agent, prediction, representation, (15 more...)

2103.01644

Country:

Europe > United Kingdom > England > East Yorkshire > Hull (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Singapore (0.04)
Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)

Genre: Research Report (1.00)

Industry: Transportation (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.88)

Willemsen, Daniël, Coppola, Mario, de Croon, Guido C. H. E.

MAMBPO: Sample-efficient multi-robot reinforcement learning using learned world models

Multi-robot systems can benefit from reinforcement learning (RL) algorithms that learn behaviours in a small number of trials, a property known as sample efficiency. This research thus investigates the use of learned world models to improve sample efficiency. We present a novel multi-agent model-based RL algorithm: Multi-Agent Model-Based Policy Optimization (MAMBPO), utilizing the Centralized Learning for Decentralized Execution (CLDE) framework. CLDE algorithms allow a group of agents to act in a fully decentralized manner after training. This is a desirable property for many systems comprising multiple robots. MAMBPO uses a learned world model to improve sample efficiency compared to model-free Multi-Agent Soft Actor-Critic (MASAC). We demonstrate this on two simulated multi-robot tasks, where MAMBPO achieves similar performance to MASAC but requires far fewer samples to do so. Through this, we take an important step towards making real-life learning for multi-robot systems possible.

agent, algorithm, mambpo, (13 more...)

2103.03662

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Nekoei, Hadi, Badrinaaraayanan, Akilesh, Courville, Aaron, Chandar, Sarath

Continuous Coordination As a Realistic Scenario for Lifelong Learning

arXiv.org Artificial Intelligence Mar-4-2021

Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi -- a partially-observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods, and benchmark state-of-the-art LLL algorithms in limited memory and computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides us with a pragmatic way of going beyond centralized training, which is the most commonly used training protocol in MARL. We empirically show that the agents trained in our setup are able to coordinate well with unseen agents, without any of the additional assumptions made by previous works.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2103.03216

Country: North America > Canada > Quebec > Montreal (0.04)

Genre:

Instructional Material (0.57)
Research Report (0.50)
Overview (0.46)

Industry:

Education > Educational Setting > Continuing Education (0.83)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

#artificialintelligence Mar-3-2021, 04:00:11 GMT

Microsoft brings RPA to Windows 10 with new Power Platform products

Microsoft announced AI-focused Power Platform products at its Microsoft Ignite 2021 conference, which kicked off in earnest today. Among the highlights is Power Automate Desktop for Windows 10 users, a robotic process automation (RPA) service that automates tasks within Windows across various apps. New Power Virtual Agents features were also unveiled. RPA -- technology that automates monotonous, repetitive chores traditionally performed by human workers -- is big business. Forrester estimates that RPA and other AI subfields created jobs for 40% of companies in 2019 and that a tenth of startups now employ more digital workers than human ones.

microsoft, power platform product, windows 10, (9 more...)

#artificialintelligence

Industry: Information Technology > Software (0.74)

Technology:

Information Technology > Artificial Intelligence > Robots (0.57)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.41)
Information Technology > Artificial Intelligence > Natural Language (0.35)