 Agents


Noe: Norms Emergence and Robustness Based on Emotions in Multiagent Systems

arXiv.org Artificial Intelligence

Social norms characterize collective and acceptable group conduct in human society. Furthermore, some social norms emerge from interactions of agents or humans. To achieve agent autonomy and make norm satisfaction explainable, we incorporate emotions into the normative reasoning process, which evaluates whether to comply with or violate a norm. Specifically, before selecting an action to execute, an agent observes the environment and infers the state and consequences, together with its internal states, after satisfaction or violation of a social norm. Both norm satisfaction and violation provoke further emotions, and the subsequent emotions affect norm enforcement. This paper investigates how modeling emotions affects the emergence and robustness of social norms via social simulation experiments. We find that an ability in agents to consider emotional responses to the outcomes of norm satisfaction and violation (1) promotes norm compliance; and (2) improves societal welfare.


Revisiting Citizen Science Through the Lens of Hybrid Intelligence

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) can augment and sometimes even replace human cognition. Inspired by efforts to value human agency alongside productivity, we discuss the benefits of solving Citizen Science (CS) tasks with Hybrid Intelligence (HI), a synergetic mixture of human and artificial intelligence. Currently there is no clear framework or methodology on how to create such an effective mixture. Due to the unique participant-centered set of values and the abundance of tasks drawing upon both human common sense and complex 21st century skills, we believe that the field of CS offers an invaluable testbed for the development of HI and human-centered AI of the 21st century, while benefiting CS as well. In order to investigate this potential, we first relate CS to adjacent computational disciplines. Then, we demonstrate that CS projects can be grouped according to their potential for HI-enhancement by examining two key dimensions: the level of digitization and the amount of knowledge or experience required for participation. Finally, we propose a framework for types of human-AI interaction in CS based on established criteria of HI. This "HI lens" provides the CS community with an overview of several ways to utilize the combination of AI and human intelligence in their projects. It also allows the AI community to gain ideas on how developing AI in CS projects can further their own field.


Space Force scientist says it's 'imperative' military uses human augmentation by employing AI agents

Daily Mail - Science & tech

Combining humans with machines to create superhuman intelligence may soon no longer be the plot of science-fiction films, as the US Space Force's chief scientist says it will happen in 'the coming decade.' Dr. Joel Mozer, speaking at an event at the Air Force Research Laboratory Wednesday, announced we are entering the age of 'human augmentation,' which is crucial to the US's national defense in order to not 'fall behind our strategic competitors.' However, his proposal does not turn humans into cyborgs, but employs 'AI agents' to assist with strategic military planning. Mozer highlights the abilities seen in AlphaGo Zero, developed by a Google subsidiary, which was able to train itself to play the game of Go at a master level in just a few weeks. Mozer suggests such extraordinary abilities can lead to superhuman capabilities, by means of combining human ingenuity with the power, speed and efficiency of machines.


Human strategic decision making in parametrized games

arXiv.org Artificial Intelligence

Strong algorithms have been developed for game classes with many elements of complexity. For example, algorithms were recently able to defeat human professional players in 2-player [16, 3] and 6-player no-limit Texas hold'em [4]. These games have imperfect information, sequential actions, very large state spaces, and the latter has more than two players (solving multiplayer games is more challenging than two-player zero-sum games from a complexity-theoretic perspective). However, these algorithms all require an extremely large amount of computational resources for offline and/or online computations and for optimizing neural network hyperparameters. The algorithms also have a further limitation in that they are using all these resources just to solve for one very specific version of the game (e.g., Libratus and DeepStack assumed that all players start the hand with 200 times the big blind, and Pluribus assumed that all players start the hand with 100 times the big blind).


Collaborative Human-Agent Planning for Resilience

arXiv.org Artificial Intelligence

Intelligent agents powered by AI planning assist people in complex scenarios, such as managing teams of semi-autonomous vehicles. However, AI planning models may be incomplete, leading to plans that do not adequately meet the stated objectives, especially in unpredicted situations. Humans, who are adept at identifying and adapting to unusual situations, may be able to assist planning agents in these situations by encoding their knowledge into a planner at run-time. We investigate whether people can collaborate with agents by providing their knowledge to an agent using linear temporal logic (LTL) at run-time without changing the agent's domain model. We presented 24 participants with baseline plans for situations in which a planner had limitations, and asked the participants for workarounds for these limitations. We encoded these workarounds as LTL constraints. Results show that participants' constraints improved the expected return of the plans by 10% ($p < 0.05$) relative to baseline plans, demonstrating that human insight can be used in collaborative planning for resilience. However, participants used more declarative than control constraints over time, and declarative constraints produced plans less similar to the participants' expectations, which could lead to trust issues.


Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization

arXiv.org Artificial Intelligence

Standard dynamics models for continuous control make use of feedforward computation to predict the conditional distribution of next state and reward given current state and action using a multivariate Gaussian with a diagonal covariance structure. This modeling choice assumes that different dimensions of the next state and reward are conditionally independent given the current state and action and may be driven by the fact that fully observable physics-based simulation environments entail deterministic transition dynamics. In this paper, we challenge this conditional independence assumption and propose a family of expressive autoregressive dynamics models that generate different dimensions of the next state and reward sequentially conditioned on previous dimensions. We demonstrate that autoregressive dynamics models indeed outperform standard feedforward models in log-likelihood on heldout transitions. Furthermore, we compare different model-based and model-free off-policy evaluation (OPE) methods on RL Unplugged, a suite of offline MuJoCo datasets, and find that autoregressive dynamics models consistently outperform all baselines, achieving a new state-of-the-art. Finally, we show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer through data augmentation and improving performance using model-based planning. Model-based Reinforcement Learning (RL) aims to learn an approximate model of the environment's dynamics from existing logged interactions to facilitate efficient policy evaluation and optimization. Early work on Model-based RL uses simple tabular (Sutton, 1990; Moore and Atkeson, 1993; Peng and Williams, 1993) and locally linear (Atkeson et al., 1997) dynamics models, which often result in a large degree of model bias (Deisenroth and Rasmussen, 2011). 
Recent work adopts feedforward neural networks to model complex transition dynamics and improve generalization to unseen states and actions, achieving a high level of performance on standard RL benchmarks (Chua et al., 2018; Wang et al., 2019).
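
The contrast drawn in the abstract above can be written as two factorizations. This is a sketch in notation assumed here rather than taken from the paper: $x = (s', r)$ stacks the next state and reward into $D$ dimensions, $x_{<i}$ denotes the dimensions preceding $i$, and $\mu_i$, $\sigma_i$ are the outputs of a feedforward network.

```latex
% Feedforward model, diagonal Gaussian: dimensions are conditionally independent
p(x \mid s, a) = \prod_{i=1}^{D} \mathcal{N}\!\left(x_i \mid \mu_i(s, a),\, \sigma_i^2(s, a)\right)

% Autoregressive model: each dimension is conditioned on the earlier ones
p(x \mid s, a) = \prod_{i=1}^{D} p\!\left(x_i \mid s, a, x_{<i}\right)
```

The second line is simply the chain rule of probability, so it places no independence restriction on the joint distribution; the first is the special case in which every factor ignores $x_{<i}$.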


Rule-based Shielding for Partially Observable Monte-Carlo Planning

arXiv.org Artificial Intelligence

Partially Observable Monte-Carlo Planning (POMCP) is a powerful online algorithm able to generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding complete policy representation. The lack of an explicit representation, however, hinders policy interpretability and makes policy verification very complex. In this work, we propose two contributions. The first is a method for identifying unexpected actions selected by POMCP with respect to expert prior knowledge of the task. The second is a shielding approach that prevents POMCP from selecting unexpected actions. The first method is based on Satisfiability Modulo Theory (SMT). It inspects traces (i.e., sequences of belief-action-observation triplets) generated by POMCP to compute the parameters of logical formulas about policy properties defined by the expert. The second contribution is a module that uses the logical formulas online to identify anomalous actions selected by POMCP and substitutes those actions with actions that satisfy the logical formulas expressing expert knowledge. We evaluate our approach on Tiger, a standard benchmark for POMDPs, and a real-world problem related to velocity regulation in mobile robot navigation. Results show that the shielded POMCP outperforms the standard POMCP in a case study in which a wrong parameter of POMCP makes it select wrong actions from time to time. Moreover, we show that the approach maintains good performance even if the parameters of the logical formula are optimized using trajectories containing some wrong actions.
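
The shielding idea described above can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the `shield` and `tiger_rule` functions, the action names, the fixed `THRESHOLD` (in the paper the formula parameters are computed with SMT from traces), and the single fixed fallback action (the paper substitutes actions that satisfy the formulas).

```python
from typing import Callable, Sequence

Belief = Sequence[float]  # assumed layout: (P(tiger left), P(tiger right))

def shield(action: str, belief: Belief,
           rule: Callable[[str, Belief], bool], fallback: str) -> str:
    """Return the planner's action if the rule accepts it, else the fallback."""
    return action if rule(action, belief) else fallback

# Hypothetical expert rule for Tiger: open a door only when the belief
# that the tiger is behind the OTHER door reaches a threshold.
THRESHOLD = 0.9  # illustrative value, not a learned parameter

def tiger_rule(action: str, belief: Belief) -> bool:
    p_left, p_right = belief
    if action == "open_left":
        return p_right >= THRESHOLD
    if action == "open_right":
        return p_left >= THRESHOLD
    return True  # listening is always permitted

if __name__ == "__main__":
    # An uncertain belief: the shield replaces the risky action.
    print(shield("open_left", (0.5, 0.5), tiger_rule, "listen"))
```

In this sketch the shield only filters the action the planner has already chosen; it does not alter the planner's search itself.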


CaSiNo: A Corpus of Campsite Negotiation Dialogues for Automatic Negotiation Systems

arXiv.org Artificial Intelligence

Automated systems that negotiate with humans have broad applications in pedagogy and conversational AI. To advance the development of practical negotiation systems, we present CaSiNo: a novel corpus of over a thousand negotiation dialogues in English. Participants take the role of campsite neighbors and negotiate for food, water, and firewood packages for their upcoming trip. Our design results in diverse and linguistically rich negotiations while maintaining a tractable, closed-domain environment. Inspired by the literature in human-human negotiations, we annotate persuasion strategies and perform correlation analysis to understand how the dialogue behaviors are associated with the negotiation performance. We further propose and evaluate a multi-task framework to recognize these strategies in a given utterance. We find that multi-task learning substantially improves the performance for all strategy labels, especially for the ones that are the most skewed. We release the dataset, annotations, and the code to propel future work in human-machine negotiations: https://github.com/kushalchawla/CaSiNo


Dynamic Cat Swarm Optimization Algorithm for Backboard Wiring Problem

arXiv.org Artificial Intelligence

This paper presents a powerful swarm intelligence meta-heuristic optimization algorithm called Dynamic Cat Swarm Optimization. It is formulated by modifying the existing Cat Swarm Optimization. The original Cat Swarm Optimization suffers from the shortcoming of "premature convergence", that is, the possibility of entrapment in local optima, which usually happens due to an imbalance between the exploration and exploitation phases. Therefore, the proposed algorithm suggests a new method to provide a proper balance between these phases by modifying the selection scheme and the seeking mode of the algorithm. To evaluate the performance of the proposed algorithm, 23 classical test functions, 10 modern test functions (CEC 2019) and a real-world scenario are used. In addition, the dimension-wise diversity metric is used to measure the percentage of the exploration and exploitation phases. The optimization results show the effectiveness of the proposed algorithm, which ranks first compared to several well-known algorithms available in the literature. Furthermore, statistical methods and graphs are used to confirm that the algorithm outperforms the alternatives. Finally, the conclusion as well as future directions to further improve the algorithm are discussed.


Loss Functions, Axioms, and Peer Review

Journal of Artificial Intelligence Research

It is common to see a handful of reviewers reject a highly novel paper, because they view, say, extensive experiments as far more important than novelty, whereas the community as a whole would have embraced the paper. More generally, the disparate mapping of criteria scores to final recommendations by different reviewers is a major source of inconsistency in peer review. In this paper we present a framework inspired by empirical risk minimization (ERM) for learning the community's aggregate mapping. The key challenge that arises is the specification of a loss function for ERM. We consider the class of L(p,q) loss functions, which is a matrix-extension of the standard class of Lp losses on vectors; here the choice of the loss function amounts to choosing the hyperparameters p and q. To deal with the absence of ground truth in our problem, we instead draw on computational social choice to identify desirable values of the hyperparameters p and q. Specifically, we characterize p=q=1 as the only choice of these hyperparameters that satisfies three natural axiomatic properties. Finally, we implement and apply our approach to reviews from IJCAI 2017.
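
To make the role of the hyperparameters p and q concrete, here is a minimal sketch of an L(p,q) loss on a matrix of residuals. It assumes the common L(p,q) matrix-norm convention (an L_p norm within each row, then an L_q norm across rows); the function name `lpq_loss` and the row/column orientation are assumptions, not details confirmed by the abstract.

```python
from typing import Sequence

def lpq_loss(residuals: Sequence[Sequence[float]], p: float, q: float) -> float:
    """L(p,q) norm of a residual matrix: L_p over each row, then L_q over rows."""
    # L_p norm of every row
    row_norms = [sum(abs(r) ** p for r in row) ** (1.0 / p) for row in residuals]
    # L_q norm of the vector of row norms
    return sum(n ** q for n in row_norms) ** (1.0 / q)

if __name__ == "__main__":
    errors = [[1.0, -2.0], [3.0, 4.0]]  # made-up residuals for illustration
    print(lpq_loss(errors, 1, 1))  # p = q = 1: sum of absolute residuals
    print(lpq_loss(errors, 2, 2))  # p = q = 2: Frobenius norm
```

With p = q = 1, the choice the abstract singles out, the loss reduces to the sum of absolute residuals, so the orientation assumption does not matter in that case.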