AITopics

doi: 10.1613/jair.1.12233

AI Access Foundation

12233

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Oceania > Australia (0.04)
(3 more...)

Genre: Overview (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Dubey, Abhimanyu, Pentland, Alex

Provably Efficient Cooperative Multi-Agent Reinforcement Learning with Function Approximation

arXiv.org Machine Learning Mar-8-2021

Cooperative multi-agent reinforcement learning (MARL) systems are prevalent in many engineering systems, e.g., robotic systems (Ding et al., 2020), power grids (Yu et al., 2014), traffic control (Bazzan, 2009), as well as team games (Zhao et al., 2019). Increasingly, federated (Yang et al., 2019) and distributed (Peteiro-Barral & Guijarro-Berdiñas, 2013) machine learning is gaining prominence in industrial applications, and reinforcement learning in these large-scale settings is growing in importance in the research community as well (Zhuo et al., 2019; Liu et al., 2019). Recent research in the statistical learning community has focused on cooperative multi-agent decision-making algorithms with provable guarantees (Zhang et al., 2018b; Wai et al., 2018; Zhang et al., 2018a). However, prior work focuses on algorithms that, while decentralized, provide guarantees on convergence (e.g., Zhang et al. (2018b)) but no finite-sample guarantees for regret, in contrast to efficient algorithms with function approximation proposed for single-agent RL (e.g., Jin et al. (2018, 2020); Yang et al. (2020)). Moreover, optimization in the decentralized multi-agent setting is also known to be non-convergent without assumptions (Tan, 1993). Developing no-regret multi-agent algorithms is therefore an important problem in RL. For the (relatively) easier problem of multi-agent multi-armed bandits, there has been significant recent interest in decentralized algorithms involving agents communicating over a network (Landgren et al., 2016a, 2018; Martínez-Rubio et al., 2019; Dubey & Pentland, 2020b), as well as in distributed settings (Hillel et al., 2013; Wang et al., 2019). Since several application areas for distributed sequential decision-making regularly involve non-stationarity and contextual information (Polydoros & Nalpantidis, 2017), an MDP formulation can potentially provide stronger algorithms for these settings as well. Furthermore, no-regret algorithms in the single-agent RL setting with function approximation (e.g., Jin et al. (2020)) build on analysis techniques for contextual bandits, which leads us to the question: Can no-regret function approximation be extended to (decentralized) cooperative multi-agent reinforcement learning?

agent, algorithm, probability, (14 more...)

arXiv.org Machine Learning

2103.04972

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)

Genre:

Research Report (0.81)
Workflow (0.67)

Industry:

Leisure & Entertainment (0.67)
Energy > Power Industry (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial Intelligence Mar-8-2021

The AI Index 2021 Annual Report

Zhang, Daniel, Mishra, Saurabh, Brynjolfsson, Erik, Etchemendy, John, Ganguli, Deep, Grosz, Barbara, Lyons, Terah, Manyika, James, Niebles, Juan Carlos, Sellitto, Michael, Shoham, Yoav, Clark, Jack, Perrault, Raymond

Welcome to the fourth edition of the AI Index Report. This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford Institute for Human-Centered Artificial Intelligence (HAI). The AI Index Report tracks, collates, distills, and visualizes data related to artificial intelligence. Its mission is to provide unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, executives, journalists, and the general public to develop intuitions about the complex field of AI. The report aims to be the most credible and authoritative source for data and insights about AI in the world.

data mining, large language model, machine learning, (27 more...)

2103.06312

Country:

Asia > India (0.46)
Oceania > Australia (0.28)
Asia > Japan (0.28)
(110 more...)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Social Sector (1.00)
Media > News (1.00)
Leisure & Entertainment > Sports (1.00)
(17 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
(15 more...)

Cetin, Edoardo, Celiktutan, Oya

Domain-Robust Visual Imitation Learning with Mutual Information Constraints

arXiv.org Artificial Intelligence Mar-8-2021

Human beings are able to understand objectives and learn by simply observing others perform a task. Imitation learning methods aim to replicate such capabilities; however, they generally depend on access to a full set of optimal states and actions taken with the agent's actuators and from the agent's point of view. In this paper, we introduce a new algorithm, called Disentangling Generative Adversarial Imitation Learning (DisentanGAIL), with the purpose of bypassing such constraints. Our algorithm enables autonomous agents to learn directly from high dimensional observations of an expert performing a task, by making use of adversarial learning with a latent representation inside the discriminator network. Such latent representation is regularized through mutual information constraints to incentivize learning only features that encode information about the completion levels of the task being demonstrated. This yields a shared feature space in which imitation can succeed while disregarding the differences between the expert's and the agent's domains. Empirically, our algorithm is able to efficiently imitate in a diverse range of control problems including balancing, manipulation and locomotive tasks, while being robust to various domain differences in terms of both environment appearance and agent embodiment.

agent, disentangail, information, (15 more...)

2103.05079

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.50)

Industry: Transportation > Ground (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)

Shahar, Tomer (Ben Gurion University of the Negev) | Shekhar, Shashank (Ben Gurion University of the Negev) | Atzmon, Dor (Ben Gurion University of the Negev) | Saffidine, Abdallah (The University of New South Wales, Sydney, Australia) | Juba, Brendan (Washington University in St. Louis, USA) | Stern, Roni

Safe Multi-Agent Pathfinding with Time Uncertainty

Journal of Artificial Intelligence Research Mar-7-2021

In many real-world scenarios, the time it takes for a mobile agent, e.g., a robot, to move from one location to another may vary due to exogenous events and be difficult to predict accurately. Planning in such scenarios is challenging, especially in the context of Multi-Agent Pathfinding (MAPF), where the goal is to find paths for multiple agents and temporal coordination is necessary to avoid collisions. In this work, we consider a MAPF problem with this form of time uncertainty, where we are only given upper and lower bounds on the time it takes each agent to move. The objective is to find a safe solution, which is a solution that can be executed by all agents and is guaranteed to avoid collisions. We propose two complete and optimal algorithms for finding safe solutions based on well-known MAPF algorithms, namely, A* with Operator Decomposition (A* + OD) and Conflict-Based Search (CBS). Experimentally, we observe that on several standard MAPF grids the CBS-based algorithm performs better. We also explore the option of online replanning in this context, i.e., modifying the agents' plans during execution, to reduce the overall execution cost. We consider two online settings: (a) when an agent can sense the current time and its current location, and (b) when the agents can also communicate seamlessly during execution. For each setting, we propose a replanning algorithm and analyze its behavior theoretically and empirically. Our experimental evaluation confirms that online replanning in both settings can indeed significantly reduce solution cost.
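The notion of a safe solution under bounded move times can be illustrated with a small sketch. It is not the paper's A* + OD or CBS algorithm; it only checks a given set of paths. The function names `occupancy` and `is_safe`, the shared bounds `lo` and `hi` for every move, the restriction to vertex conflicts, and the conservative rule that an agent may occupy a vertex until its latest possible arrival at the next one are all assumptions made for illustration.

```python
from itertools import combinations


def occupancy(path, lo, hi):
    """Return (vertex, earliest_arrival, latest_departure) for each step.

    Assumption: every move takes between lo and hi time units, and the
    agent may still occupy a vertex until its latest possible arrival
    at the next vertex. The final vertex is held forever.
    """
    intervals = []
    earliest, latest = 0, 0
    for i, vertex in enumerate(path):
        is_last = i == len(path) - 1
        depart = float("inf") if is_last else latest + hi
        intervals.append((vertex, earliest, depart))
        earliest += lo
        latest += hi
    return intervals


def is_safe(paths, lo, hi):
    """True if no two agents can ever occupy the same vertex at once.

    Only vertex conflicts are checked; edge (swap) conflicts are ignored
    in this simplified sketch.
    """
    all_intervals = [occupancy(p, lo, hi) for p in paths]
    for a, b in combinations(all_intervals, 2):
        for va, start_a, end_a in a:
            for vb, start_b, end_b in b:
                # Closed intervals overlap when each starts before the other ends.
                if va == vb and start_a <= end_b and start_b <= end_a:
                    return False
    return True


if __name__ == "__main__":
    paths = [["A", "B", "C"], ["D", "E", "F", "B"]]
    print(is_safe(paths, lo=1, hi=1))  # exact move times
    print(is_safe(paths, lo=1, hi=3))  # uncertain move times
```

With exact move times the two example paths never meet at vertex B, but once each move may take up to three time units the first agent might still be at B when the second can arrive, so the same paths are no longer guaranteed collision-free.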

agent, artificial intelligence, conflict, (15 more...)

doi: 10.1613/jair.1.12397

AI Access Foundation

12397

Country:

Oceania > Australia > New South Wales (0.14)
North America > United States > California (0.14)
Europe > Spain > Catalonia (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Embodied Continual Learning Across Developmental Time Via Developmental Braitenberg Vehicles

Alicea, Bradly, Chakrabarty, Rishabh, Gopi, Akshara, Lim, Anson, Parent, Jesse

There is much to learn through synthesis of Developmental Biology, Cognitive Science and Computational Modeling. One lesson we can learn from this perspective is that the initialization of intelligent programs cannot solely rely on manipulation of numerous parameters. Our path forward is to present a design for developmentally-inspired learning agents based on the Braitenberg Vehicle. Using these agents to exemplify artificial embodied intelligence, we move closer to modeling embodied experience and morphogenetic growth as components of cognitive developmental capacity. We consider various factors regarding biological and cognitive development which influence the generation of adult phenotypes and the contingency of available developmental pathways. These mechanisms produce emergent connectivity with shifting weights and adaptive network topography, thus illustrating the importance of developmental processes in training neural networks. This approach provides a blueprint for adaptive agent behavior that might result from a developmental approach: namely by exploiting critical periods or growth and acquisition, an explicitly embodied network architecture, and a distinction between the assembly of neural networks and active learning on these networks. The process of biological development provides many novel lessons for machine learning and artificial intelligence.

developmental freedom, neural network, vehicle, (13 more...)

2103.05753

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

The RLR-Tree: A Reinforcement Learning Based R-Tree for Spatial Data

Gu, Tu, Feng, Kaiyu, Cong, Gao, Long, Cheng, Wang, Zheng, Wang, Sheng

Learned indices have been proposed to replace classic index structures like B-Tree with machine learning (ML) models. They require replacing both the indices and query processing algorithms currently deployed by the databases, and such a radical departure is likely to encounter challenges and obstacles. In contrast, we propose a fundamentally different way of using ML techniques to improve on the query performance of the classic R-Tree without the need to change its structure or query processing algorithms.

Despite the success of these learned indices in improving the performance of some types of queries, they still have various limitations, e.g., they can only handle spatial point objects and limited types of spatial queries, some only return approximate query results, and they either cannot handle updates or need a periodic rebuild to retain high query efficiency (detailed discussions are in Section 2). These limitations, together with the requirement that the learned indices need a replacement of the index structures and ...

dataset, node, query, (17 more...)

2103.04541

Country:

Asia > India (0.04)
Asia > China (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

Ren, Zhongqiang, Rathinam, Sivakumar, Choset, Howie

Loosely Synchronized Search for Multi-agent Path Finding with Asynchronous Actions

Multi-agent path finding (MAPF) determines an ensemble of collision-free paths for multiple agents between their respective start and goal locations. Among the available MAPF planners for workspaces modeled as a graph, A*-based approaches have been widely investigated and have demonstrated their efficiency in numerous scenarios. However, almost all of these A*-based approaches assume that each agent executes an action concurrently in that all agents start and stop together. This article presents a natural generalization of MAPF with asynchronous actions where agents do not necessarily start and stop concurrently. The main contribution of the work is a proposed approach called Loosely Synchronized Search (LSS) that extends A*-based MAPF planners to handle asynchronous actions. We show LSS is complete and finds an optimal solution if one exists. We also combine LSS with other existing MAPF methods that aim to trade off optimality for computational efficiency. Extensive numerical results are presented to corroborate the performance of the proposed approaches. Finally, we also verify the applicability of our method in the Robotarium, a remotely accessible swarm robotics research platform.

agent, algorithm, vertex, (16 more...)

2103.04516

Country:

North America > United States > Texas (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Pasternak, Katarzyna, Wu, Zishi, Visser, Ubbo, Lisetti, Christine

Let's be friends! A rapport-building 3D embodied conversational agent for the Human Support Robot

Partial subtle mirroring of nonverbal behaviors during conversations (also known as mimicking or parallel empathy) is essential for rapport building, which in turn is essential for optimal human-human communication outcomes. Mirroring has been studied in interactions between robots and humans, and in interactions between Embodied Conversational Agents (ECAs) and humans. However, very few studies examine interactions between humans and ECAs that are integrated with robots, and none of them examine the effect of mirroring nonverbal behaviors in such interactions. Our research question is whether integrating an ECA able to mirror its interlocutor's facial expressions and head movements (continuously or intermittently) with a human-service robot will improve the user's experience with the support robot that is able to perform useful mobile manipulative tasks (e.g. at home). Our contribution is the complex integration of an expressive ECA, able to track its interlocutor's face and to mirror his/her facial expressions and head movements in real time, with a human support robot such that the robot and the agent are fully aware of each other's, and of the users', nonverbal cues. We also describe a pilot study we conducted towards answering our research question, which shows promising results for our forthcoming larger user study.

experiment, interaction, participant, (13 more...)

2103.04498

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Miami-Dade County > Coral Gables (0.04)
(2 more...)

Genre: Research Report > Experimental Study > Negative Result (0.55)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Adaptive Agent Architecture for Real-time Human-Agent Teaming

Ni, Tianwei, Li, Huao, Agrawal, Siddharth, Raja, Suhas, Jia, Fan, Gui, Yikang, Hughes, Dana, Lewis, Michael, Sycara, Katia

Teamwork is a set of interrelated reasoning, actions and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-agent teams. However, human-agent teaming is less well studied because it is so new and involves asymmetry in policy and intent not present in human teams. To optimize team performance in human-agent teaming, it is critical that agents infer human intent and adapt their policies for smooth coordination. Most literature in human-agent teaming builds agents referencing a learned human model. Though these agents are guaranteed to perform well with the learned model, they place heavy assumptions on human policy, such as optimality and consistency, which are unlikely to hold in many real-world scenarios. In this paper, we propose a novel adaptive agent architecture in a human-model-free setting on a two-player cooperative game, namely Team Space Fortress (TSF). Previous human-human team research has shown complementary policies in the TSF game and diversity in human players' skill, which encourages us to relax the assumptions on human policy. Therefore, we discard learning human models from human data, and instead use an adaptation strategy on a pre-trained library of exemplar policies composed of RL algorithms or rule-based methods with minimal assumptions of human behavior. The adaptation strategy relies on a novel similarity metric to infer human policy and then selects the most complementary policy in our library to maximize the team performance. The adaptive agent architecture can be deployed in real time and generalizes to any off-the-shelf static agents. We conducted human-agent experiments to evaluate the proposed adaptive agent framework, and demonstrated the suboptimality, diversity, and adaptability of human policies in human-agent teams.
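The two-step adaptation described above (infer which exemplar the human resembles, then pick the most complementary partner) can be sketched as follows. The abstract does not specify the similarity metric, so average log-likelihood of the observed actions is used here purely as a stand-in assumption. The names `avg_log_likelihood`, `select_partner`, the tabular policy format, and the example policy labels and scores are all hypothetical.

```python
import math


def avg_log_likelihood(policy, trajectory, eps=1e-6):
    """Average log-probability of observed (state, action) pairs.

    policy: dict mapping state -> dict mapping action -> probability.
    trajectory: non-empty list of (state, action) pairs.
    This metric is an assumed stand-in, not the paper's similarity metric.
    """
    total = 0.0
    for state, action in trajectory:
        prob = policy.get(state, {}).get(action, 0.0)
        total += math.log(prob + eps)  # eps avoids log(0) for unseen actions
    return total / len(trajectory)


def select_partner(trajectory, exemplars, team_score):
    """Pick the exemplar closest to the human, then its best partner.

    exemplars: dict mapping exemplar name -> policy.
    team_score: dict mapping exemplar name -> {partner name: score},
    assumed to be measured offline by pairing library policies.
    """
    closest = max(
        exemplars,
        key=lambda name: avg_log_likelihood(exemplars[name], trajectory),
    )
    partner = max(team_score[closest], key=team_score[closest].get)
    return closest, partner


if __name__ == "__main__":
    # Hypothetical one-state exemplar policies and offline team scores.
    exemplars = {
        "aggressive": {"s": {"fire": 0.9, "move": 0.1}},
        "cautious": {"s": {"fire": 0.2, "move": 0.8}},
    }
    team_score = {
        "aggressive": {"partner_x": 5.0, "partner_y": 3.0},
        "cautious": {"partner_x": 2.0, "partner_y": 6.0},
    }
    observed = [("s", "move"), ("s", "move"), ("s", "fire")]
    print(select_partner(observed, exemplars, team_score))
```

Because the selection only reads a short window of observed actions and a precomputed score table, it can be rerun during play as new observations arrive, which is one plausible way to read the real-time claim.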

agent, fortress, team performance, (16 more...)

2103.04439

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Hawaii (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)