Goto

Collaborating Authors

 Agents


Bounds on the Cost of Stabilizing a Cooperative Game

Journal of Artificial Intelligence Research

A key issue in cooperative game theory is coalitional stability, usually captured by the notion of the core---the set of outcomes that are resistant to group deviations. However, some coalitional games have empty cores, and any outcome in such a game is unstable. We investigate the possibility of stabilizing a coalitional game by using subsidies. We consider scenarios where an external party that is interested in having the players work together offers a supplemental payment to the grand coalition, or, more generally, a particular coalition structure. This payment is conditional on players not deviating from this coalition structure, and may be divided among the players in any way they wish. We define the cost of stability as the minimum external payment that stabilizes the game. We provide tight bounds on the cost of stability, both for games where the coalitional values are nonnegative (profit-sharing games) and for games where the coalitional values are nonpositive (cost-sharing games), under natural assumptions on the characteristic function, such as superadditivity, anonymity, or both. We also investigate the relationship between the cost of stability and several variants of the least core. Finally, we study the computational complexity of problems related to the cost of stability, with a focus on weighted voting games.


A Summary of Adaptation of Techniques from Search-based Optimal Multi-Agent Path Finding Solvers to Compilation-based Approach

arXiv.org Artificial Intelligence

In the multi-agent path finding problem (MAPF) we are given a set of agents each with respective start and goal positions. The task is to find paths for all agents while avoiding collisions aiming to minimize an objective function. Two such common objective functions is the sum-of-costs and the makespan. Many optimal solvers were introduced in the past decade - two prominent categories of solvers can be distinguished: search-based solvers and compilation-based solvers. Search-based solvers were developed and tested for the sum-of-costs objective while the most prominent compilation-based solvers that are built around Boolean satisfiability (SAT) were designed for the makespan objective. Very little was known on the performance and relevance of the compilation-based approach on the sum-of-costs objective. In this paper we show how to close the gap between these cost functions in the compilation-based approach. Moreover we study applicability of various techniques developed for search-based solvers in the compilation-based approach. A part of this paper introduces a SAT-solver that is directly aimed to solve the sum-of-costs objective function. Using both a lower bound on the sum-of-costs and an upper bound on the makespan, we are able to have a reasonable number of variables in our SAT encoding. We then further improve the encoding by borrowing ideas from ICTS, a search-based solver. Experimental evaluation on several domains show that there are many scenarios where our new SAT-based methods outperforms the best variants of previous sum-of-costs search solvers - the ICTS, CBS algorithms, and ICBS algorithms.


Decentralized Decision-Making Over Multi-Task Networks

arXiv.org Machine Learning

In important applications involving multi-task networks with multiple objectives, agents in the network need to decide between these multiple objectives and reach an agreement about which single objective to follow for the network. In this work we propose a distributed decision-making algorithm. The agents are assumed to observe data that may be generated by different models. Through localized interactions, the agents reach agreement about which model to track and interact with each other in order to enhance the network performance. We investigate the approach for both static and mobile networks. The simulations illustrate the performance of the proposed strategies.


Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks

arXiv.org Machine Learning

Learning when to communicate and doing that effectively is essential in multi-agent tasks. Recent works show that continuous communication allows efficient training with back-propagation in multi-agent scenarios, but have been restricted to fully-cooperative tasks. In this paper, we present Individualized Controlled Continuous Communication Model (IC3Net) which has better training efficiency than simple continuous communication model, and can be applied to semi-cooperative and competitive settings along with the cooperative settings. IC3Net controls continuous communication with a gating mechanism and uses individualized rewards foreach agent to gain better performance and scalability while fixing credit assignment issues. Using variety of tasks including StarCraft BroodWars explore and combat scenarios, we show that our network yields improved performance and convergence rates than the baselines as the scale increases. Our results convey that IC3Net agents learn when to communicate based on the scenario and profitability.


In the Quest for General Intelligence, AIs Are Chasing Chickens in Minecraft

IEEE Spectrum Robotics

If artificial intelligence (AI) agents are to become real players in society, using their machine abilities to complement our human strengths, they must first become players in the video game of Minecraft. And to prove themselves in Minecraft, they must work together to capture animals in a maze, build towers of blocks, and hunt for treasure while fighting off skeletons. That, anyway, is the premise of a competition organized by Microsoft, Queen Mary University of London, and crowdAI (a platform for data-science challenges). Next month, the organizers will announce the winner--the team that created an AI that could best observe its Minecraft environment, determine which of three missions it had to accomplish, and then collaborate with another AI agent to carry out that mission. By emphasizing adaptability and cooperation, the organizers aimed to encourage research on AI agents that could one day interact with humans to accomplish tasks in the real world.


Optimal Network Topology for Effective Collective Response

arXiv.org Artificial Intelligence

Natural, social, and artificial multi-agent systems usually operate in dynamic environments, where the ability to respond to changing circumstances is a crucial feature. An effective collective response requires suitable information transfer among agents, and thus is critically dependent on the agents' interaction network. In order to investigate the influence of the network topology on collective response, we consider an archetypal model of distributed decision-making---the leader-follower linear consensus---and study the collective capacity of the system to follow a dynamic driving signal (the "leader") for a range of topologies and system sizes. The analysis reveals a nontrivial relationship between optimal topology and frequency of the driving signal. Interestingly, the response is optimal when each individual interacts with a certain number of agents which decreases monotonically with the frequency and, for large enough systems, is independent of the size of the system. This phenomenology is investigated in experiments of collective motion using a swarm of land robots. The emergent collective response to both a slow- and a fast-changing leader is measured and analyzed for a range of interaction topologies. These results have far-reaching practical implications for the design and understanding of distributed systems, since they highlight that a dynamic rewiring of the interaction network is paramount to the effective collective operations of multi-agent systems at different time-scales.


Lifelong Testing of Smart Autonomous Systems by Shepherding a Swarm of Watchdog Artificial Intelligence Agents

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) technologies could be broadly categorised into Analytics and Autonomy. Analytics focuses on algorithms offering perception, comprehension, and projection of knowledge gleaned from sensorial data. Autonomy revolves around decision making, and influencing and shaping the environment through action production. A smart autonomous system (SAS) combines analytics and autonomy to understand, learn, decide and act autonomously. To be useful, SAS must be trusted and that requires testing. Lifelong learning of a SAS compounds the testing process. In the remote chance that it is possible to fully test and certify the system pre-release, which is theoretically an undecidable problem, it is near impossible to predict the future behaviours that these systems, alone or collectively, will exhibit. While it may be feasible to severely restrict such systems\textquoteright \ learning abilities to limit the potential unpredictability of their behaviours, an undesirable consequence may be severely limiting their utility. In this paper, we propose the architecture for a watchdog AI (WAI) agent dedicated to lifelong functional testing of SAS. We further propose system specifications including a level of abstraction whereby humans shepherd a swarm of WAI agents to oversee an ecosystem made of humans and SAS. The discussion extends to the challenges, pros, and cons of the proposed concept.


Deception, Identity, and Security

Communications of the ACM

"When the world is destroyed, it will be destroyed not by its madmen but by the sanity of its experts and the superior ignorance of its bureaucrats." Decades before the advent of the Internet, Fernando António Nogueira Pessoa assumed a variety of identities with the ease that has become common in cyber-social platforms--those where cyber technologies play a part in human activity (for example, online banking, and social networks). Pessoa, a Portuguese poet, writer, literary critic, translator, publisher, and philosopher, wrote under his own name as well as 75 imaginary identities. He would write poetry or prose using one identity, then criticize that writing using another identity, then defend the original writing using yet another identity. Described by author Carmela Ciuraru as "the loving ringmaster, director, and traffic cop of his literary crew," Pessoa is one of the foremost Portuguese poets and a contributor to the Western canon. The story of Pessoa illustrates a key insight that holds true for the cyber-social systems of today: Identity costs little in the way of minting, forming, and maintaining yet demands a high price for its timely and accurate attribution to physical agency. Along with the low cost of minting and maintaining identities, a lack of constraints on using identities is a primary factor that facilitates adversarial innovations that rely on deception. With these factors in mind, we study the following problem: Will it be possible to engineer a decentralized system that can enforce honest usage of identity via mutual challenges and costly consequences when challenges fail? The success of such an approach will remedy currently deteriorating situations without requiring new infrastructure. For example, such a system should be able to reduce fake persons in social engineering attacks, malware that mimics the attributes of trusted software, and Sybil attacks that use fake identities to penetrate ad hoc networks. Note that many cyber-physical facilities--those where a physical mechanism is controlled or monitored by computer algorithms and tied closely to the internet and its users (for example, autonomous cars, medical monitoring)--also aim to enable users to remain anonymous and carry out certain tasks with only a persistent but pseudonymous identity. This form of short-term identity (especially in the networks that are ad hoc, hastily formed, and short lived) can remain uncoupled from a user's physical identity and allow them to maintain a strong form of privacy control.


Hierarchical Macro Strategy Model for MOBA Game AI

arXiv.org Artificial Intelligence

The next challenge of game AI lies in Real Time Strategy (RTS) games. RTS games provide partially observable gaming environments, where agents interact with one another in an action space much larger than that of GO. Mastering RTS games requires both strong macro strategies and delicate micro level execution. Recently, great progress has been made in micro level execution, while complete solutions for macro strategies are still lacking. In this paper, we propose a novel learning-based Hierarchical Macro Strategy model for mastering MOBA games, a sub-genre of RTS games. Trained by the Hierarchical Macro Strategy model, agents explicitly make macro strategy decisions and further guide their micro level execution. Moreover, each of the agents makes independent strategy decisions, while simultaneously communicating with the allies through leveraging a novel imitated cross-agent communication mechanism. We perform comprehensive evaluations on a popular 5v5 Multiplayer Online Battle Arena (MOBA) game. Our 5-AI team achieves a 48% winning rate against human player teams which are ranked top 1% in the player ranking system.


On Improving Decentralized Hysteretic Deep Reinforcement Learning

arXiv.org Machine Learning

Recent successes of value-based multi-agent deep reinforcement learning employ optimism by limiting underestimation updates of value function estimator, through carefully controlled learning rate (Omidshafiei et al., 2017) or reduced update probability (Palmer et al., 2018). To achieve full cooperation when learning independently, an agent must estimate the state values contingent on having optimal teammates; therefore, value overestimation is frequency injected to counteract negative effects caused by unobservable teammate sub-optimal policies and explorations. Aiming to solve this issue through automatic scheduling, this paper introduces a decentralized quantile estimator, which we found empirically to be more stable, sample efficient and more likely to converge to the joint optimal policy.