Agents
Intelligent Agent for Hurricane Emergency Identification and Text Information Extraction from Streaming Social Media Big Data
Huang, Jingwei, Khallouli, Wael, Rabadi, Ghaith, Seck, Mamadou
This paper presents our research on leveraging social media Big Data and AI to support hurricane disaster emergency response. Current hurricane rescue response relies heavily on emergency call centres, and the recent Hurricane Harvey event revealed the limitations of these systems. We use Hurricane Harvey and the associated Houston flooding as the motivating scenario to conduct research and develop a proof-of-concept prototype in which an intelligent agent plays a complementary role in supporting emergency centres during hurricane emergency response. The agent collects real-time streaming tweets during a natural disaster event, identifies tweets requesting rescue, extracts key information such as the address and associated geocode, and visualizes the extracted information on an interactive map for decision support. Our experiment shows promising outcomes and the potential application of the research in support of hurricane emergency response.
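The identify-then-extract steps described in this abstract can be sketched in a few lines. This is only an illustrative stand-in, not the authors' agent: the keyword list, the address pattern, and the function names are all hypothetical, a real system would likely use a trained classifier and a geocoding service, and the streaming collection and map visualization steps are omitted.

```python
import re

# Hypothetical keyword list; a crude stand-in for a trained classifier.
RESCUE_KEYWORDS = ("rescue", "trapped", "stranded", "help")

# Hypothetical pattern for simple US-style street addresses:
# house number, one to three capitalized words, then a street suffix.
ADDRESS_PATTERN = re.compile(
    r"\b\d{1,6}\s+(?:[A-Z][a-z]+\s+){1,3}"
    r"(?:St|Street|Ave|Avenue|Rd|Road|Dr|Drive|Ln|Lane|Blvd)\b"
)


def is_rescue_request(text):
    """Return True if the tweet text contains any rescue keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in RESCUE_KEYWORDS)


def extract_address(text):
    """Return the first address-like substring, or None if none is found."""
    match = ADDRESS_PATTERN.search(text)
    return match.group(0) if match else None


def process_stream(tweets):
    """Filter an iterable of tweet texts and attach any extracted address."""
    results = []
    for text in tweets:
        if is_rescue_request(text):
            results.append({"text": text, "address": extract_address(text)})
    return results


if __name__ == "__main__":
    # Made-up sample tweets, for illustration only.
    sample = [
        "Nice weather today",
        "Family trapped on roof at 1234 Maple Creek Dr, need rescue",
    ]
    for item in process_stream(sample):
        print(item)
```

The extracted address string would then be passed to a geocoder to obtain coordinates for the map; that step depends on an external service and is left out here.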
Multi-Context Systems: Dynamics and Evolution (Pre-Print of "Multi-context systems in dynamic environments")
Cabalar, Pedro, Costantini, Stefania, De Gasperis, Giovanni, Formisano, Andrea
Multi-Context Systems (MCS) model, in Computational Logic, distributed systems composed of heterogeneous sources, or "contexts", that interact via special rules called "bridge rules". In this paper, we consider how to enhance flexibility and generality in the definition and application of bridge rules. In particular, we introduce and discuss some formal extensions of MCSs that are useful for practical use in dynamic environments, and we try to provide guidelines for implementations.
A Game-Theoretic Approach to Multi-Agent Trust Region Optimization
Wen, Ying, Chen, Hui, Yang, Yaodong, Tian, Zheng, Li, Minne, Chen, Xu, Wang, Jun
Trust region methods are widely applied in single-agent reinforcement learning problems due to their monotonic performance-improvement guarantee at every iteration. Nonetheless, when applied in multi-agent settings, the guarantee of trust region methods no longer holds because an agent's payoff is also affected by other agents' adaptive behaviors. To tackle this problem, we conduct a game-theoretical analysis in the policy space, and propose a multi-agent trust region learning method (MATRL), which enables trust region optimization for multi-agent learning. Specifically, MATRL finds a stable improvement direction that is guided by the solution concept of Nash equilibrium at the meta-game level. We derive the monotonic improvement guarantee in multi-agent settings and empirically show the local convergence of MATRL to stable fixed points in the two-player rotational differential game. To test our method, we evaluate MATRL in both discrete and continuous multiplayer general-sum games including checker and switch grid worlds, multi-agent MuJoCo, and Atari games. Results suggest that MATRL significantly outperforms strong multi-agent reinforcement learning baselines.
A New Formalism, Method and Open Issues for Zero-Shot Coordination
Treutlein, Johannes, Dennis, Michael, Oesterheld, Caspar, Foerster, Jakob
In many coordination problems, independently reasoning humans are able to discover mutually compatible policies. In contrast, independently trained self-play policies are often mutually incompatible. Zero-shot coordination (ZSC) has recently been proposed as a new frontier in multi-agent reinforcement learning to address this fundamental issue. Prior work approaches the ZSC problem by assuming players can agree on a shared learning algorithm but not on labels for actions and observations, and proposes other-play as an optimal solution. However, until now, this "label-free" problem has only been informally defined. We formalize this setting as the label-free coordination (LFC) problem by defining the label-free coordination game. We show that other-play is not an optimal solution to the LFC problem as it fails to consistently break ties between incompatible maximizers of the other-play objective. We introduce an extension of the algorithm, other-play with tie-breaking, and prove that it is optimal in the LFC problem and an equilibrium in the LFC game. Since arbitrary tie-breaking is precisely what the ZSC setting aims to prevent, we conclude that the LFC problem does not reflect the aims of ZSC. To address this, we introduce an alternative informal operationalization of ZSC as a starting point for future work.
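The tie-breaking issue mentioned in this abstract can be illustrated with a toy matching game. The sketch below is an assumption-laden illustration, not the paper's formalism: the three-lever game, the restriction to deterministic policies, and the function names are all hypothetical. Two players each pull a lever and receive that lever's payoff only if they pull the same one; the other-play value averages the payoff over all relabelings of the partner's lever that preserve the payoff structure.

```python
from itertools import permutations


def payoff_symmetries(payoffs):
    """All lever permutations that leave the payoff vector unchanged."""
    n = len(payoffs)
    return [
        perm for perm in permutations(range(n))
        if all(payoffs[perm[i]] == payoffs[i] for i in range(n))
    ]


def other_play_value(payoffs, lever_a, lever_b):
    """Expected payoff when player A pulls lever_a and player B's choice
    lever_b is relabeled by a uniformly random payoff-preserving symmetry."""
    symmetries = payoff_symmetries(payoffs)
    total = 0.0
    for perm in symmetries:
        if lever_a == perm[lever_b]:
            total += payoffs[lever_a]
    return total / len(symmetries)


if __name__ == "__main__":
    # Hypothetical game: levers 0 and 1 are interchangeable, lever 2 is unique.
    payoffs = [1.0, 1.0, 0.5]
    print("both pick lever 0:", other_play_value(payoffs, 0, 0))
    print("both pick lever 2:", other_play_value(payoffs, 2, 2))
    print("one picks 0, other picks 2:", other_play_value(payoffs, 0, 2))
```

In this toy game, "always pull lever 0" and "always pull lever 2" obtain the same other-play value when each is paired with itself, yet they score zero against each other. Two learners that independently maximize the objective can therefore land on incompatible policies unless ties are broken consistently.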
Multi-Receiver Online Bayesian Persuasion
Castiglioni, Matteo, Marchesi, Alberto, Celli, Andrea, Gatti, Nicola
Bayesian persuasion studies how an informed sender should partially disclose information to influence the behavior of a self-interested receiver. Classical models make the stringent assumption that the sender knows the receiver's utility. This can be relaxed by considering an online learning framework in which the sender repeatedly faces a receiver of an unknown, adversarially selected type. We study, for the first time, an online Bayesian persuasion setting with multiple receivers. We focus on the case with no externalities and binary actions, as customary in offline models. Our goal is to design no-regret algorithms for the sender with polynomial per-iteration running time. First, we prove a negative result: for any $0 < \alpha \leq 1$, there is no polynomial-time no-$\alpha$-regret algorithm when the sender's utility function is supermodular or anonymous. Then, we focus on the case of submodular sender's utility functions and we show that, in this case, it is possible to design a polynomial-time no-$(1 - \frac{1}{e})$-regret algorithm. To do so, we introduce a general online gradient descent scheme to handle online learning problems with a finite number of possible loss functions. This requires the existence of an approximate projection oracle. We show that, in our setting, there exists one such projection oracle which can be implemented in polynomial time.
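For readers unfamiliar with the online gradient descent template that this abstract builds on, the following is a minimal sketch under simplifying assumptions. It is not the authors' scheme: it uses linear losses over a probability simplex, an exact Euclidean projection in place of the approximate projection oracle, and a random choice among a finite set of loss vectors as a stand-in for the adversary. All names and parameters are hypothetical.

```python
import random


def project_to_simplex(v):
    """Exact Euclidean projection of v onto the probability simplex."""
    sorted_desc = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(sorted_desc, start=1):
        cumulative += u_i
        candidate = (cumulative - 1.0) / i
        if u_i - candidate > 0:
            theta = candidate  # the last index that passes fixes the shift
    return [max(x - theta, 0.0) for x in v]


def online_gradient_descent(loss_vectors, rounds, step_size, seed=0):
    """Run projected online gradient descent against linear losses.

    Each round, one of finitely many loss vectors is drawn at random
    (a stand-in for an adversary); the learner pays <loss, x> and then
    takes a projected gradient step.
    """
    rng = random.Random(seed)
    n = len(loss_vectors[0])
    x = [1.0 / n] * n  # start from the uniform distribution
    total_loss = 0.0
    for _ in range(rounds):
        loss = loss_vectors[rng.randrange(len(loss_vectors))]
        total_loss += sum(l * xi for l, xi in zip(loss, x))
        # The gradient of a linear loss is the loss vector itself.
        x = project_to_simplex([xi - step_size * l for xi, l in zip(x, loss)])
    return x, total_loss


if __name__ == "__main__":
    # Hypothetical losses in which the first action is always the cheapest.
    losses = [[0.0, 1.0, 1.0], [0.1, 0.9, 0.5]]
    final_x, incurred = online_gradient_descent(losses, rounds=200, step_size=0.1)
    print("final strategy:", final_x)
    print("cumulative loss:", incurred)
```

The projection step is where the abstract's approximate oracle would plug in: when the feasible set is too complex for exact projection, an oracle that returns a nearby feasible point is used instead, at the cost of an approximation factor in the regret guarantee.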
A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising
Wen, Chao, Xu, Miao, Zhang, Zhilin, Zheng, Zhenzhe, Wang, Yuhui, Liu, Xiangyu, Rong, Yu, Xie, Dong, Tan, Xiaoyang, Yu, Chuan, Xu, Jian, Wu, Fan, Chen, Guihai, Zhu, Xiaoqiang
In online advertising, auto-bidding has become an essential tool for advertisers to optimize their preferred ad performance metrics by simply expressing high-level campaign objectives and constraints. Previous works consider the design of auto-bidding agents from a single-agent view, without modeling the mutual influence between agents. In this paper, we instead consider this problem from the perspective of a distributed multi-agent system, and propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies. First, we investigate the competition and cooperation relation among auto-bidding agents, and propose temperature-regularized credit assignment for establishing a mixed cooperative-competitive paradigm. By carefully trading off competition and cooperation among the agents, we can reach an equilibrium state that guarantees not only each individual advertiser's utility but also the system performance (social welfare). Second, because cooperation leads to observed collusion behaviors in which agents bid low prices, we further propose bar agents that set a personalized bidding bar for each agent, thereby alleviating the degradation of revenue. Third, to deploy MAAB to a large-scale advertising system with millions of advertisers, we propose a mean-field approach. By grouping advertisers with the same objective into a mean auto-bidding agent, the interactions among advertisers are greatly simplified, making it practical to train MAAB efficiently. Extensive experiments on an offline industrial dataset and the Alibaba advertising platform demonstrate that our approach outperforms several baseline methods in terms of social welfare and guarantees the ad platform's revenue.
Army researchers develop innovative framework for training AI
Army researchers have developed a pioneering framework that provides a baseline for the development of collaborative multi-agent systems. The framework is detailed in the survey paper "Survey of recent multi-agent reinforcement learning algorithms utilizing centralized training," which is featured in the SPIE Digital Library. Researchers said the work will support research in reinforcement learning approaches for developing collaborative multi-agent systems such as teams of robots that could work side-by-side with future soldiers. "We propose that the underlying information sharing mechanism plays a critical role in centralized learning for multi-agent systems, but there is limited study of this phenomena within the research community," said Army researcher and computer scientist Dr. Piyush K. Sharma of the U.S. Army Combat Capabilities Development Command, known as DEVCOM, Army Research Laboratory. "We conducted this survey of the state-of-the-art in reinforcement learning algorithms and their information sharing paradigms as a basis for asking fundamental questions on centralized learning for multi-agent systems that would improve their ability to work together."
Local non-Bayesian social learning with stubborn agents
Vial, Daniel, Subramanian, Vijay
We study a social learning model in which agents iteratively update their beliefs about the true state of the world using private signals and the beliefs of other agents in a non-Bayesian manner. Some agents are stubborn, meaning they attempt to convince others of an erroneous true state (modeling fake news). We show that while agents learn the true state on short timescales, they "forget" it and believe the erroneous state to be true on longer timescales. Using these results, we devise strategies for seeding stubborn agents so as to disrupt learning, which outperform intuitive heuristics and give novel insights regarding vulnerabilities in social learning.
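The qualitative effect described in this abstract can be reproduced in a small simulation. The sketch below is a toy under strong assumptions and is not the paper's model: it assumes a fully connected population, a belief that is a single number in [0, 1], and a weight on private signals that decays as 1/(t + 1), so that social averaging dominates in the long run. The function name and all parameters are hypothetical.

```python
import random


def simulate(num_regular=20, num_stubborn=2, rounds=2000,
             true_state=1.0, signal_accuracy=0.8, seed=0):
    """Return the mean belief of the regular agents after each round.

    A belief is an agent's estimate that the state equals 1. Stubborn
    agents always report the erroneous state and never update.
    """
    rng = random.Random(seed)
    false_state = 1.0 - true_state
    beliefs = [0.5] * num_regular
    population = num_regular + num_stubborn
    history = []
    for t in range(rounds):
        # Assumed schedule: private signals matter less as time goes on.
        signal_weight = 1.0 / (t + 1)
        # Everyone hears the population average, stubborn agents included.
        social = (sum(beliefs) + num_stubborn * false_state) / population
        updated = []
        for _ in range(num_regular):
            correct = rng.random() < signal_accuracy
            signal = true_state if correct else false_state
            updated.append((1.0 - signal_weight) * social + signal_weight * signal)
        beliefs = updated
        history.append(sum(beliefs) / num_regular)
    return history


if __name__ == "__main__":
    trajectory = simulate()
    print("mean belief after round 1:", trajectory[0])
    print("mean belief after round 2000:", trajectory[-1])
```

Under these assumptions, the early rounds are driven by mostly correct private signals, while in later rounds the constant pull of the stubborn agents dominates and the mean belief drifts toward the erroneous state. The decaying signal weight is what produces the two timescales in this toy.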
Hard Choices in Artificial Intelligence
Dobbe, Roel, Gilbert, Thomas Krendl, Mintz, Yonatan
As AI systems are integrated into high-stakes social domains, researchers now examine how to design and operate them in a safe and ethical manner. However, the criteria for identifying and diagnosing safety risks in complex social contexts remain unclear and contested. In this paper, we examine the vagueness in debates about the safety and ethical behavior of AI systems. We show that this vagueness cannot be resolved through mathematical formalism alone, and instead requires deliberation about the politics of development as well as the context of deployment. Drawing from a new sociotechnical lexicon, we redefine vagueness in terms of distinct design challenges at key stages in AI system development. The resulting framework of Hard Choices in Artificial Intelligence (HCAI) empowers developers by 1) identifying points of overlap between design decisions and major sociotechnical challenges, and 2) motivating the creation of stakeholder feedback channels so that safety issues can be exhaustively addressed. As such, HCAI contributes to a timely debate about the status of AI development in democratic societies, arguing that deliberation should be the goal of AI Safety, not just the procedure by which it is ensured.