AITopics | Agents

Agents

From Psychological Curiosity to Artificial Curiosity: Curiosity-Driven Learning in Artificial Intelligence Tasks

Sun, Chenyu, Qian, Hangwei, Miao, Chunyan

arXiv.org Artificial Intelligence | Jan-20-2022

Psychological curiosity plays a significant role in human intelligence, enhancing learning through exploration and information acquisition. In the Artificial Intelligence (AI) community, artificial curiosity provides a natural intrinsic motivation for efficient learning, inspired by human cognitive development; meanwhile, it can bridge the existing gap between AI research and practical application scenarios caused by problems such as overfitting, poor generalization, limited training samples, and high computational cost. As a result, curiosity-driven learning (CDL) has become increasingly popular, where agents are self-motivated to learn novel knowledge. In this paper, we first present a comprehensive review of the psychological study of curiosity and summarize a unified framework for quantifying curiosity as well as its arousal mechanism. Based on the psychological principle, we further survey the literature on existing CDL methods in the fields of Reinforcement Learning, Recommendation, and Classification, where both advantages and disadvantages as well as future work are discussed. As a result, this work provides fruitful insights for future CDL research and yields possible directions for further improvement.

agent, curiosity, intrinsic reward, (13 more...)

arXiv.org Artificial Intelligence

2201.083

Country:

Asia > Middle East > Jordan (0.04)
Asia > Singapore (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
(3 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(7 more...)

Multi-agent Covering Option Discovery based on Kronecker Product of Factor Graphs

Chen, Jiayu, Chen, Jingdi, Lan, Tian, Aggarwal, Vaneet

arXiv.org Artificial Intelligence | Jan-20-2022

Covering option discovery has been developed to improve the exploration of reinforcement learning in single-agent scenarios with sparse reward signals, by connecting the most distant states in the embedding space provided by the Fiedler vector of the state transition graph. However, these option discovery methods cannot be directly extended to multi-agent scenarios, since the joint state space grows exponentially with the number of agents in the system. Thus, existing research on adopting options in multi-agent scenarios still relies on single-agent option discovery and fails to directly discover the joint options that can improve the connectivity of the joint state space of agents. In this paper, we show that it is indeed possible to directly compute multi-agent options with collaborative exploratory behaviors among the agents, while still enjoying the ease of decomposition. Our key idea is to approximate the joint state space as a Kronecker graph, that is, the Kronecker product of individual agents' state transition graphs, based on which we can directly estimate the Fiedler vector of the joint state space using the Laplacian spectrum of individual agents' transition graphs. This decomposition enables us to efficiently construct multi-agent joint options by encouraging agents to connect the sub-goal joint states that correspond to the minimum or maximum values of the estimated joint Fiedler vector. The evaluation based on multi-agent collaborative tasks shows that the proposed algorithm can successfully identify multi-agent options, and significantly outperforms prior work using single-agent options or no options, in terms of both faster exploration and higher cumulative rewards.
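
To make the Kronecker-graph idea in this abstract concrete, here is a minimal sketch of how a joint transition graph can be formed as the Kronecker product of two individual adjacency matrices. It is an illustration only, not the authors' implementation: the function name `kronecker_product` and the two toy graphs are assumptions, and the Fiedler-vector estimation step described in the abstract is not shown.

```python
def kronecker_product(a, b):
    """Kronecker product of two square adjacency matrices (lists of lists).

    Joint state (i, j) is mapped to index i * len(b) + j, so the entry at
    row (i, j), column (k, l) equals a[i][k] * b[j][l].
    """
    n, m = len(a), len(b)
    return [
        [a[i][k] * b[j][l] for k in range(n) for l in range(m)]
        for i in range(n)
        for j in range(m)
    ]


# Hypothetical toy transition graphs (assumed for illustration).
agent_1 = [[0, 1],
           [1, 0]]          # two states joined by one edge
agent_2 = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]       # three states on a path: s0 - s1 - s2

joint = kronecker_product(agent_1, agent_2)

# The joint graph has 2 * 3 = 6 states. A joint edge exists only when
# both agents move along an edge of their own graph at the same time.
for row in joint:
    print(row)
```

With 2 and 3 individual states the joint graph already has 6 states, which hints at why working with the small factor graphs is attractive as the number of agents grows.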

agent, multi-agent option, state space, (12 more...)

arXiv.org Artificial Intelligence

2201.08227

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.88)

Self-Awareness Safety of Deep Reinforcement Learning in Road Traffic Junction Driving

Cao, Zehong, Yun, Jie

arXiv.org Artificial Intelligence | Jan-20-2022

Autonomous driving has been at the forefront of public interest, and a pivotal issue among the widespread concerns is safety in the transportation system. Deep reinforcement learning (DRL) has been applied to autonomous driving to provide solutions for obstacle avoidance. However, in a road traffic junction scenario, the vehicle typically receives partial observations from the transportation environment, while DRL needs to rely on long-term rewards to train a reliable model by maximising the cumulative rewards, which may involve taking risks when exploring new actions that return either a positive reward or a penalty in the case of collisions. Although safety concerns are usually considered in the design of a reward function, they are not fully considered as the critical metric to directly evaluate the effectiveness of DRL algorithms in autonomous driving. In this study, we evaluated the safety performance of three baseline DRL models (DQN, A2C, and PPO) and proposed a self-awareness module from an attention mechanism for DRL to improve the safety evaluation for an anomalous vehicle in a complex road traffic junction environment, such as intersection and roundabout scenarios, based on four metrics: collision rate, success rate, freezing rate, and total reward. Our experimental results in both the training and testing phases revealed that the baseline DRL models have poor safety performance, while our proposed self-awareness attention-DQN can significantly improve the safety performance in intersection and roundabout scenarios.
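
The four metrics named in this abstract can be sketched as simple aggregates over evaluation episodes. The snippet below is a rough illustration under stated assumptions: the outcome labels ("collision", "success", "freeze"), the function name `safety_metrics`, and the sample episodes are all hypothetical, and the paper's exact metric definitions may differ.

```python
from collections import Counter


def safety_metrics(episodes):
    """Aggregate per-episode outcomes into rate-style safety metrics.

    episodes: list of (outcome, total_reward) pairs, where outcome is one
    of the assumed labels "collision", "success", or "freeze".
    """
    counts = Counter(outcome for outcome, _ in episodes)
    n = len(episodes)
    return {
        "collision_rate": counts["collision"] / n,
        "success_rate": counts["success"] / n,
        "freezing_rate": counts["freeze"] / n,
        "mean_total_reward": sum(reward for _, reward in episodes) / n,
    }


# Hypothetical evaluation log (made-up values for illustration only).
episodes = [("success", 10.0), ("success", 8.0), ("collision", -5.0), ("freeze", 1.0)]
metrics = safety_metrics(episodes)
print(metrics)
```

Reporting the rates separately from the reward keeps a policy that earns high reward while colliding often from looking safe.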

ego vehicle, scenario, vehicle, (13 more...)

arXiv.org Artificial Intelligence

2201.08116

Country:

Oceania > Australia > Tasmania > Hobart (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Generalizing Off-Policy Evaluation From a Causal Perspective For Sequential Decision-Making

Parbhoo, Sonali, Joshi, Shalmali, Doshi-Velez, Finale

arXiv.org Machine Learning | Jan-20-2022

Assessing the effects of a policy based on observational data from a different policy is a common problem across several high-stakes decision-making domains, and several off-policy evaluation (OPE) techniques have been proposed. However, these methods largely formulate OPE as a problem disassociated from the process used to generate the data (i.e. structural assumptions in the form of a causal graph). We argue that explicitly highlighting this association has important implications for our understanding of the fundamental limits of OPE. First, this implies that the current formulation of OPE corresponds to a narrow set of tasks, i.e. a specific causal estimand which is focused on prospective evaluation of policies over populations or sub-populations. Second, we demonstrate how this association motivates natural desiderata to consider a general set of causal estimands, particularly extending the role of OPE for counterfactual off-policy evaluation at the level of individuals of the population. A precise description of the causal estimand highlights which OPE estimands are identifiable from observational data under the stated generative assumptions. For those OPE estimands that are not identifiable, the causal perspective further highlights where more experimental data is necessary, and highlights situations where human expertise can aid identification and estimation. Furthermore, many formalisms of OPE overlook the role of uncertainty entirely in the estimation process. We demonstrate how specifically characterising the causal estimand highlights the different sources of uncertainty and when human expertise can naturally manage this uncertainty. We discuss each of these aspects as actionable desiderata for future OPE research at scale and in line with practical utility.
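
As background for readers new to OPE, the following is a minimal sketch of ordinary importance sampling, one standard population-level OPE estimator. It is not the method proposed in this paper. It assumes the behavior policy's action probabilities are known and that there is no unobserved confounding, which is exactly the kind of structural assumption the abstract argues should be made explicit. All names and data are hypothetical.

```python
def importance_sampling_estimate(trajectories, target_policy, behavior_policy):
    """Ordinary importance-sampling estimate of the target policy's expected return.

    trajectories: list of trajectories, each a list of (state, action, reward).
    target_policy / behavior_policy: dicts mapping state -> {action: probability}.
    """
    total = 0.0
    for trajectory in trajectories:
        weight = 1.0   # product of per-step probability ratios
        ret = 0.0      # undiscounted return of this trajectory
        for state, action, reward in trajectory:
            weight *= target_policy[state][action] / behavior_policy[state][action]
            ret += reward
        total += weight * ret
    return total / len(trajectories)


# Hypothetical one-step problem: a single state "s" with actions "a" and "b".
behavior_policy = {"s": {"a": 0.5, "b": 0.5}}   # policy that collected the data
target_policy = {"s": {"a": 0.8, "b": 0.2}}     # policy we want to evaluate
trajectories = [[("s", "a", 1.0)], [("s", "b", 0.0)]]

estimate = importance_sampling_estimate(trajectories, target_policy, behavior_policy)
print(estimate)
```

In this toy setting the target policy's true expected reward is 0.8 * 1.0 + 0.2 * 0.0 = 0.8, and the reweighted sample average recovers it. The estimate is a population-level quantity; it says nothing about individual-level counterfactuals, which is the extension the abstract discusses.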

assumption, estimand, ope, (16 more...)

arXiv.org Machine Learning

2201.08262

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.83)

Iconary: A pictionary-like game to improve the communication skills of AI agents

#artificialintelligence | Jan-19-2022, 00:44:36 GMT

While artificial intelligence (AI) agents have become increasingly skilled at communicating with humans, they still struggle with several aspects of language, including complex semantics. The term semantics refers to the area of linguistics that relates to the meaning associated with specific words or logical connections between different concepts. A few years ago, researchers at the Allen Institute for AI developed a game called Iconary, which is designed to improve the ability of AI techniques to communicate and make connections between different objects. In a recent paper pre-published on arXiv and presented at last year's EMNLP conference, the researchers introduced a more advanced version of the game and trained machine learning algorithms to play against each other or with humans. "Our paper is based on a project at AI2 aimed at training models to play Iconary, a Pictionary-based game we created, where a player has to guess what another player is drawing," Christopher Clark, one of the researchers who carried out the study, told TechXplore.

clark, guesser, iconary, (10 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.64)

When Is It Acceptable to Break the Rules? Knowledge Representation of Moral Judgement Based on Empirical Data

Awad, Edmond, Levine, Sydney, Loreggia, Andrea, Mattei, Nicholas, Rahwan, Iyad, Rossi, Francesca, Talamadupula, Kartik, Tenenbaum, Joshua, Kleiman-Weiner, Max

arXiv.org Artificial Intelligence | Jan-19-2022

One of the most remarkable things about the human moral mind is its flexibility. We can make moral judgments about cases we have never seen before. We can decide that pre-established rules should be broken. We can invent novel rules on the fly. Capturing this flexibility is one of the central challenges in developing AI systems that can interpret and produce human-like moral judgment. This paper details the results of a study of real-world decision makers who judge whether it is acceptable to break a well-established norm: "no cutting in line." We gather data on how human participants judge the acceptability of line-cutting in a range of scenarios. Then, in order to effectively embed these reasoning capabilities into a machine, we propose a method for modeling them using a preference-based structure, which captures a novel modification to standard "dual process" theories of moral judgment.

evaluation variable, person 0, scenario, (16 more...)

arXiv.org Artificial Intelligence

2201.07763

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.46)
Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.83)
(2 more...)

K-nearest Multi-agent Deep Reinforcement Learning for Collaborative Tasks with a Variable Number of Agents

Khorasgani, Hamed, Wang, Haiyan, Tang, Hsiu-Khuern, Gupta, Chetan

arXiv.org Artificial Intelligence | Jan-18-2022

Traditionally, the performance of multi-agent deep reinforcement learning algorithms is demonstrated and validated in gaming environments where we often have a fixed number of agents. In many industrial applications, the number of available agents can change on any given day, and even when the number of agents is known ahead of time, it is common for an agent to break during operation and become unavailable for a period of time. In this paper, we propose a new deep reinforcement learning algorithm for multi-agent collaborative tasks with a variable number of agents. We demonstrate the application of our algorithm using a fleet management simulator developed by Hitachi to generate realistic scenarios in a production site.

agent, algorithm, vehicle, (13 more...)

arXiv.org Artificial Intelligence

2201.07092

Country: North America > United States > California > Santa Clara County > Santa Clara (0.05)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning

Mu, Tong, Zheng, Stephan, Trott, Alexander

arXiv.org Artificial Intelligence | Jan-18-2022

Multi-agent reinforcement learning (MARL) is a powerful framework for studying emergent behavior in complex agent-based simulations. However, RL agents are often assumed to be rational and behave optimally, which does not fully reflect human behavior. Here, we study more human-like RL agents that incorporate an established model of human irrationality, the Rational Inattention (RI) model. RI models the cost of cognitive information processing using mutual information. Our RIRL framework generalizes and is more flexible than prior work by allowing for multi-timestep dynamics and information channels with heterogeneous processing costs. We evaluate RIRL in Principal-Agent (specifically manager-employee relations) problem settings of varying complexity where RI models information asymmetry (e.g. it may be costly for the manager to observe certain information about the employees). We show that using RIRL yields a rich spectrum of new equilibrium behaviors that differ from those found under rational assumptions. For instance, some forms of a Principal's inattention can increase Agent welfare due to increased compensation, while other forms of inattention can decrease Agent welfare by encouraging extra work effort. Additionally, new strategies emerge compared to those under rationality assumptions, e.g., Agents are incentivized to increase work effort. These results suggest RIRL is a powerful tool for building AI agents that can mimic real human behavior.
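
Since the abstract states that RI models the cost of information processing using mutual information, a small worked example may help. The sketch below computes the mutual information, in bits, between a hidden signal and what the decision-maker perceives, for two hypothetical joint distributions. The function name and the example distributions are assumptions for illustration; how RIRL turns this quantity into a cost term is not specified here.

```python
import math


def mutual_information(joint):
    """Mutual information I(X; Y) in bits for a joint distribution joint[x][y]."""
    px = [sum(row) for row in joint]            # marginal over X
    py = [sum(col) for col in zip(*joint)]      # marginal over Y
    mi = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:                           # zero-probability cells contribute nothing
                mi += p * math.log2(p / (px[x] * py[y]))
    return mi


# Hypothetical channels between a binary signal X and a binary percept Y.
fully_attentive = [[0.5, 0.0],
                   [0.0, 0.5]]      # the percept reveals the signal exactly
fully_inattentive = [[0.25, 0.25],
                     [0.25, 0.25]]  # the percept is independent of the signal

print(mutual_information(fully_attentive))    # 1.0
print(mutual_information(fully_inattentive))  # 0.0
```

Under a rational-inattention reading, the attentive channel carries one bit and would incur the larger processing cost, while the inattentive channel carries no information and would cost nothing.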

agent, attention cost, inattention, (12 more...)

arXiv.org Artificial Intelligence

2202.01691

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia (0.04)
Europe > Italy > Sardinia > Cagliari (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Games (0.68)
Information Technology (0.46)
Education (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Standby-Based Deadlock Avoidance Method for Multi-Agent Pickup and Delivery Tasks

Yamauchi, Tomoki, Miyashita, Yuki, Sugawara, Toshiharu

arXiv.org Artificial Intelligence | Jan-18-2022

The multi-agent pickup and delivery (MAPD) problem, in which multiple agents iteratively carry materials without collisions, has received significant attention. However, many conventional MAPD algorithms assume a specifically designed grid-like environment, such as an automated warehouse. Such environments have many pickup and delivery locations where agents can stay for a lengthy period, as well as plentiful detours to avoid collisions owing to the freedom of movement in a grid. By contrast, because a maze-like environment such as a search-and-rescue or construction site has fewer pickup/delivery locations and their numbers may be unbalanced, many agents concentrate on such locations, resulting in inefficient operations and often becoming stuck or deadlocked. Thus, to improve the transportation efficiency even in a maze-like restricted environment, we propose a deadlock avoidance method, called standby-based deadlock avoidance (SBDA). SBDA uses standby nodes determined in real-time using the articulation-point-finding algorithm, and the agent is guaranteed to stay there for a finite amount of time. We demonstrated that our proposed method outperforms a conventional approach. We also analyzed how the parameters used for selecting standby nodes affect the performance.
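
The abstract mentions an articulation-point-finding algorithm without giving details. The sketch below shows the classic depth-first-search approach to finding articulation points (nodes whose removal disconnects the graph) on a small hypothetical map. It is a generic illustration, not the SBDA procedure itself, and how SBDA uses these points to select standby nodes is not shown; the graph and function name are assumptions.

```python
def articulation_points(graph):
    """Return the set of articulation points of an undirected graph.

    graph: dict mapping each node to a list of neighboring nodes.
    Uses a recursive DFS with discovery times and low-link values.
    """
    discovery, low, result = {}, {}, set()
    counter = [0]

    def dfs(node, parent):
        discovery[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbor in graph[node]:
            if neighbor == parent:
                continue
            if neighbor in discovery:
                # Back edge: the neighbor was reached earlier in the DFS.
                low[node] = min(low[node], discovery[neighbor])
            else:
                children += 1
                dfs(neighbor, node)
                low[node] = min(low[node], low[neighbor])
                # A non-root node is an articulation point if some child
                # subtree cannot reach above it without passing through it.
                if parent is not None and low[neighbor] >= discovery[node]:
                    result.add(node)
        # The DFS root is an articulation point only with 2+ DFS children.
        if parent is None and children > 1:
            result.add(node)

    for node in graph:
        if node not in discovery:
            dfs(node, None)
    return result


# Hypothetical maze-like map: a corridor A - B - C leading into a loop C - D - E.
maze = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "E"],
    "D": ["C", "E"],
    "E": ["C", "D"],
}
points = articulation_points(maze)
print(sorted(points))  # ['B', 'C']
```

In this map, B and C are the bottlenecks: an agent parked on either one cuts the corridor off from the loop, which is the kind of location a deadlock-avoidance scheme has to reason about.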

agent, node, standby node, (16 more...)

arXiv.org Artificial Intelligence

2201.06014

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)

Genre: Research Report (0.50)

Industry: Transportation > Freight & Logistics Services (0.81)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Maria Gini wins the 2022 ACM/SIGAI Autonomous Agents Research Award

AIHub | Jan-17-2022, 12:47:26 GMT

Maria Gini is Professor of Computer Science and Engineering at the University of Minnesota, and has been at the forefront of the field of robotics and multi-agent systems for many years, consistently bringing AI into robotics. Her work has spanned both the design of novel algorithms and practical applications. These applications have been utilized in settings as varied as warehouses and hospitals, with uses such as surveillance, exploration, and search and rescue. Maria has been an active member and leader of the agents community since its inception. She has been a consistent mentor and role model, deeply committed to bringing diversity to the fields of AI, robotics, and computing.

artificial intelligence, maria gini win, sigai autonomous agent research award, (2 more...)

AIHub

Country: North America > United States > Minnesota (0.29)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
