Goto

Collaborating Authors

 Agents


Multi-winner Approval Voting Goes Epistemic

arXiv.org Artificial Intelligence

Epistemic voting interprets votes as noisy signals about a ground truth. We consider contexts where the truth consists of a set of objective winners, knowing a lower and upper bound on its cardinality. A prototypical problem for this setting is the aggregation of multi-label annotations with prior knowledge on the size of the ground truth. We posit noise models, for which we define rules that output an optimal set of winners. We report on experiments on multi-label annotations (which we collected).


Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss

arXiv.org Artificial Intelligence

In a cooperative multiagent system, a collection of agents executes a joint policy in order to achieve some common objective. The successful deployment of such systems hinges on the availability of reliable inter-agent communication. However, many sources of potential disruption to communication exist in practice, such as radio interference, hardware failure, and adversarial attacks. In this work, we develop joint policies for cooperative multiagent systems that are robust to potential losses in communication. More specifically, we develop joint policies for cooperative Markov games with reach-avoid objectives. First, we propose an algorithm for the decentralized execution of joint policies during periods of communication loss. Next, we use the total correlation of the state-action process induced by a joint policy as a measure of the intrinsic dependencies between the agents. We then use this measure to lower-bound the performance of a joint policy when communication is lost. Finally, we present an algorithm that maximizes a proxy to this lower bound in order to synthesize minimum-dependency joint policies that are robust to communication loss. Numerical experiments show that the proposed minimum-dependency policies require minimal coordination between the agents while incurring little to no loss in performance; the total correlation value of the synthesized policy is one fifth of the total correlation value of the baseline policy which does not take potential communication losses into account. As a result, the performance of the minimum-dependency policies remains consistently high regardless of whether or not communication is available. By contrast, the performance of the baseline policy decreases by twenty percent when communication is lost.


Detecting danger in gridworlds using Gromov's Link Condition

arXiv.org Artificial Intelligence

Gridworlds have been long-utilised in AI research, particularly in reinforcement learning, as they provide simple yet scalable models for many real-world applications such as robot navigation, emergent behaviour, and operations research. We initiate a study of gridworlds using the mathematical framework of reconfigurable systems and state complexes due to Abrams, Ghrist & Peterson. State complexes represent all possible configurations of a system as a single geometric space, thus making them conducive to study using geometric, topological, or combinatorial methods. The main contribution of this work is a modification to the original Abrams, Ghrist & Peterson setup which we believe is more naturally-suited to the context of gridworlds. With this modification, the state complexes may exhibit geometric defects (failure of Gromov's Link Condition), however, we argue that these failures can indicate undesirable or dangerous states in the gridworld. Our results provide a novel method for seeking guaranteed safety limitations in discrete task environments with single or multiple agents, and offer potentially useful geometric and topological information for incorporation in or analysis of machine learning systems.


Bayesian Promised Persuasion: Dynamic Forward-Looking Multiagent Delegation with Informational Burning

arXiv.org Artificial Intelligence

This work studies a dynamic mechanism design problem in which a principal delegates decision makings to a group of privately-informed agents without the monetary transfer or burning. We consider that the principal privately possesses complete knowledge about the state transitions and study how she can use her private observation to support the incentive compatibility of the delegation via informational burning, a process we refer to as the looking-forward persuasion. The delegation mechanism is formulated in which the agents form belief hierarchies due to the persuasion and play a dynamic Bayesian game. We propose a novel randomized mechanism, known as Bayesian promised delegation (BPD), in which the periodic incentive compatibility is guaranteed by persuasions and promises of future delegations. We show that the BPD can achieve the same optimal social welfare as the original mechanism in stationary Markov perfect Bayesian equilibria. A revelation-principle-like design regime is established to show that the persuasion with belief hierarchies can be fully characterized by correlating the randomization of the agents' local BPD mechanisms with the persuasion as a direct recommendation of the future promises.


Cooperative Multi-Agent Deep Reinforcement Learning for Reliable Surveillance via Autonomous Multi-UAV Control

arXiv.org Artificial Intelligence

CCTV-based surveillance using unmanned aerial vehicles (UAVs) is considered a key technology for security in smart city environments. This paper creates a case where the UAVs with CCTV-cameras fly over the city area for flexible and reliable surveillance services. UAVs should be deployed to cover a large area while minimize overlapping and shadow areas for a reliable surveillance system. However, the operation of UAVs is subject to high uncertainty, necessitating autonomous recovery systems. This work develops a multi-agent deep reinforcement learning-based management scheme for reliable industry surveillance in smart city applications. The core idea this paper employs is autonomously replenishing the UAV's deficient network requirements with communications. Via intensive simulations, our proposed algorithm outperforms the state-of-the-art algorithms in terms of surveillance coverage, user support capability, and computational costs.


Is it personal? The impact of personally relevant robotic failures (PeRFs) on humans' trust, likeability, and willingness to use the robot

arXiv.org Artificial Intelligence

In three laboratory experiments, we examine the impact of personally relevant failures (PeRFs) on perceptions of a collaborative robot. PeR is determined by how much a specific issue applies to a particular person, i.e., it affects one's own goals and values. We hypothesized that PeRFs would reduce trust in the robot and the robot's Likeability and Willingness to Use (LWtU) more than failures that are not personal to participants. To achieve PeR in human-robot interaction, we utilized three different manipulation mechanisms: A) damage to property, B) financial loss, and C) first-person versus third-person failure scenarios. In total, 132 participants engaged with a robot in person during a collaborative task of laundry sorting. All three experiments took place in the same experimental environment, carefully designed to simulate a realistic laundry sorting scenario. Results indicate that the impact of PeRFs on perceptions of the robot varied across the studies. In experiments A and B, the encounters with PeRFs reduced trust significantly relative to a no failure session. But not entirely for LWtU. In experiment C, the PeR manipulation had no impact. The work highlights challenges and adjustments needed for studying robotic failures in laboratory settings. We show that PeR manipulations affect how users perceive a failing robot. The results bring about new questions regarding failure types and their perceived severity on users' perception of the robot. Putting PeR aside, we observed differences in the way users perceive interaction failures compared (experiment C) to how they perceive technical ones (A and B).


A Survey of Opponent Modeling in Adversarial Domains

Journal of Artificial Intelligence Research

Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.


ArchABM: an agent-based simulator of human interaction with the built environment. $CO_2$ and viral load analysis for indoor air quality

arXiv.org Artificial Intelligence

Recent evidence suggests that SARS-CoV-2, which is the virus causing a global pandemic in 2020, is predominantly transmitted via airborne aerosols in indoor environments. This calls for novel strategies when assessing and controlling a building's indoor air quality (IAQ). IAQ can generally be controlled by ventilation and/or policies to regulate human-building-interaction. However, in a building, occupants use rooms in different ways, and it may not be obvious which measure or combination of measures leads to a cost- and energy-effective solution ensuring good IAQ across the entire building. Therefore, in this article, we introduce a novel agent-based simulator, ArchABM, designed to assist in creating new or adapt existing buildings by estimating adequate room sizes, ventilation parameters and testing the effect of policies while taking into account IAQ as a result of complex human-building interaction patterns. A recently published aerosol model was adapted to calculate time-dependent carbon dioxide ($CO_2$) and virus quanta concentrations in each room and inhaled $CO_2$ and virus quanta for each occupant over a day as a measure of physiological response. ArchABM is flexible regarding the aerosol model and the building layout due to its modular architecture, which allows implementing further models, any number and size of rooms, agents, and actions reflecting human-building interaction patterns. We present a use case based on a real floor plan and working schedules adopted in our research center. This study demonstrates how advanced simulation tools can contribute to improving IAQ across a building, thereby ensuring a healthy indoor environment.


Automatic Thoughts and Facial Expressions in Cognitive Restructuring with Virtual Agents

#artificialintelligence

Cognitive restructuring is a well-established mental health technique for amending automatic thoughts, which are distorted and biased beliefs about a situation, into objective and balanced thoughts. Since virtual agents can be used anytime and anywhere, they are expected to perform cognitive restructuring without being influenced by medical infrastructure or patients' stigma toward mental illness. Unfortunately, since the quantitative analysis of human-agent interaction is still insufficient, the effect on the user's cognitive state remains unclear. We collected interaction data between virtual agents and users to observe the mood improvements associated with changes in automatic thoughts that occur in user cognition and addressed the following two points: (1) implementation of a virtual agent that helps a user identify and evaluate automatic thoughts; (2) identification of the relationship between a user's facial expressions and the extent of the mood improvement subjectively felt by users during the human-agent interaction. We focus on these points because cognitive restructuring by a human therapist starts by identifying automatic thoughts and seeking sufficient evidence to find balanced thoughts (evaluation of automatic thoughts). Therapists also use such non-verbal behaviors as facial expressions to detect changes in a user's mood, which is an important indicator for guidance. Based on the results of this analysis, we provide a technical guidance framework that fully ...


Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents

arXiv.org Artificial Intelligence

Critical periods are phases during which a toddler's brain develops in spurts. To promote children's cognitive development, proper guidance is critical in this stage. However, it is not clear whether such a critical period also exists for the training of AI agents. Similar to human toddlers, well-timed guidance and multimodal interactions might significantly enhance the training efficiency of AI agents as well. To validate this hypothesis, we adapt this notion of critical periods to learning in AI agents and investigate the critical period in the virtual environment for AI agents. We formalize the critical period and Toddler-guidance learning in the reinforcement learning (RL) framework. Then, we built up a toddler-like environment with VECA toolkit to mimic human toddlers' learning characteristics. We study three discrete levels of mutual interaction: weak-mentor guidance (sparse reward), moderate mentor guidance (helper-reward), and mentor demonstration (behavioral cloning). We also introduce the EAVE dataset consisting of 30,000 real-world images to fully reflect the toddler's viewpoint. We evaluate the impact of critical periods on AI agents from two perspectives: how and when they are guided best in both uni- and multimodal learning. Our experimental results show that both uni- and multimodal agents with moderate mentor guidance and critical period on 1 million and 2 million training steps show a noticeable improvement. We validate these results with transfer learning on the EAVE dataset and find the performance advancement on the same critical period and the guidance.