Agent Societies
Impact of different belief facets on agents' decision -- a refined cognitive architecture
Sedigh, Amir Hosein Afshar, Purvis, Martin K., Savarimuthu, Bastin Tony Roy, Frantz, Christopher K, Purvis, Maryam A.
This paper presents a conceptual refinement of agent cognitive architecture inspired by the beliefs-desires-intentions (BDI) and the theory of planned behaviour (TPB) models, with an emphasis on different belief facets. This enables us to investigate the impact of personality and the way that an agent weights its internal beliefs and social sanctions on an agent's actions. The study also uses the concept of cognitive dissonance associated with the fairness of institutions to investigate the agents' behaviour. To showcase our model, we simulate two historical long-distance trading societies, namely Armenian merchants of New-Julfa and the English East India Company. The results demonstrate the importance of internal beliefs of agents as a pivotal aspect for following institutional rules.
Intention Propagation for Multi-agent Reinforcement Learning
Qu, Chao, Li, Hui, Liu, Chang, Xiong, Junwu, Zhang, James, Chu, Wei, Qi, Yuan, Song, Le
Collaborative multi-agent reinforcement learning is an important sub-field of multi-agent reinforcement learning (MARL), where the agents learn to coordinate to achieve joint success. It has wide applications in traffic control [Kuyer et al., 2008], autonomous driving [Shalev-Shwartz et al., 2016] and smart grid [Yang et al., 2018]. To learn to coordinate, interactions between agents are indispensable. For instance, humans can reason about others' behaviors or learn other people's intentions through communication and then determine an effective coordination plan. However, how to design a mechanism for such interaction in a principled way and at the same time solve large-scale real-world applications is still a challenging problem. Recently, there has been a surge of interest in solving the collaborative MARL problem [Foerster et al., 2018, Qu et al., 2019, Lowe et al., 2017]. Among them, joint policy approaches have demonstrated their superiority [Rashid et al., 2018, Sunehag et al., 2018, Oliehoek et al., 2016]. A straightforward approach is to replace the action in single-agent reinforcement learning by the joint action a = (a_1, a_2, ..., a_N), but this obviously suffers from an exponentially large action space.
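The growth of the joint action space mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every agent has the same number of discrete actions; the function name `joint_action_space` and the choice of 3 actions per agent are hypothetical and do not come from the paper.

```python
from itertools import product


def joint_action_space(actions_per_agent, num_agents):
    """Enumerate every joint action (a_1, ..., a_N) for N agents.

    Assumption: each agent picks from the same set of discrete actions,
    labelled 0 .. actions_per_agent - 1.
    """
    return list(product(range(actions_per_agent), repeat=num_agents))


# The number of joint actions is actions_per_agent ** num_agents,
# so it grows exponentially with the number of agents.
for n in (1, 2, 4, 8):
    print(n, "agents:", len(joint_action_space(3, n)), "joint actions")
```

With 3 actions per agent, going from 4 to 8 agents takes the joint space from 81 to 6561 entries, which is why enumerating joint actions directly does not scale.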
Omdena Building AI Solutions Through Global Collaboration
Omdena runs AI projects with organizations that want to get started with Artificial Intelligence, solve a real-world problem, or build deployable solutions within two months. The projects are powered by our unique Collaborative AI processes, which result in fast development, innovation, and trusted solutions through a bottom-up development process. First, an organization submits a problem or idea. Next, we publicly announce the AI project and select up to 50 engineers who work with the organization to refine the problem statement, collect the data, and build their solutions.
F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning
Li, Wenhao, Jin, Bo, Wang, Xiangfeng, Yan, Junchi, Zha, Hongyuan
Traditional centralized multi-agent reinforcement learning (MARL) algorithms are sometimes impractical in complicated applications, due to non-interactivity between agents, the curse of dimensionality, and computational complexity. This has motivated several decentralized MARL algorithms. However, existing decentralized methods only handle the fully cooperative setting, where massive information needs to be transmitted in training. The block coordinate gradient descent scheme they use for successive independent actor and critic steps can simplify the calculation, but it causes serious bias. In this paper, we propose a flexible fully decentralized actor-critic MARL framework, which can combine most actor-critic methods and handle the large-scale general cooperative multi-agent setting. A primal-dual hybrid gradient descent type algorithm framework is designed to learn individual agents separately for decentralization. From the perspective of each agent, policy improvement and value evaluation are jointly optimized, which can stabilize multi-agent policy learning. Furthermore, our framework can achieve scalability and stability for large-scale environments and reduce information transmission, by means of the parameter sharing mechanism and a novel modeling-other-agents method based on theory-of-mind and online supervised learning. Extensive experiments in the cooperative Multi-agent Particle Environment and StarCraft II show that our decentralized MARL instantiation algorithms perform competitively against conventional centralized and decentralized methods.
Trump's WHO attack accelerates breakdown in global cooperation
U.S. President Donald Trump's broadside against the World Health Organization is another blow to international institutions designed to help nations confront global crises -- and may leave countries even less prepared for the next one. Trump's move on Tuesday to suspend WHO funding amid a pandemic that has cost at least 130,000 lives is the latest salvo in a broader struggle between the U.S. and China over global leadership. Both countries are courting other nations and public opinion as they cover up their own shortcomings in the pandemic and position themselves for the post-virus world. China -- widely criticized for missteps early in the outbreak -- has ramped up efforts to send medical supplies to hard-hit nations, even as reports emerged that much of that gear was faulty or expired. The U.S., meanwhile, announced $300 million in aid to countries fighting the virus but rebuffed requests for the most essential gear while receiving donations from the governments of Egypt, Taiwan and Vietnam among others.
Quantifying the Impact of Non-Stationarity in Reinforcement Learning-Based Traffic Signal Control
Alegre, Lucas N., Bazzan, Ana L. C., da Silva, Bruno C.
In reinforcement learning (RL), dealing with non-stationarity is a challenging issue. However, some domains such as traffic optimization are inherently non-stationary. Causes for and effects of this are manifold. In particular, when dealing with traffic signal controls, addressing non-stationarity is key since traffic conditions change over time and as a function of traffic control decisions taken in other parts of a network. In this paper we analyze the effects that different sources of non-stationarity have in a network of traffic signals, in which each signal is modeled as a learning agent. More precisely, we study both the effects of changing the context in which an agent learns (e.g., a change in flow rates experienced by it), as well as the effects of reducing agent observability of the true environment state. Partial observability may cause distinct states (in which distinct actions are optimal) to be seen as the same by the traffic signal agents. This, in turn, may lead to sub-optimal performance. We show that the lack of suitable sensors to provide a representative observation of the real state seems to affect the performance more drastically than the changes to the underlying traffic patterns.
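The point about partial observability collapsing distinct states can be made concrete with a toy sketch. Everything here is a hypothetical illustration rather than the paper's model: the state is assumed to be a pair of queue lengths (north, east), `optimal_phase` is an assumed longest-queue rule, and `coarse_observation` stands in for a sensor that only detects whether an approach is occupied.

```python
def optimal_phase(state):
    """Hypothetical rule: give green to the approach with the longer queue."""
    north, east = state
    return "north_green" if north >= east else "east_green"


def coarse_observation(state):
    """Hypothetical low-resolution sensor: reports only whether each
    approach has at least one waiting vehicle, not how many."""
    north, east = state
    return (north > 0, east > 0)


state_a = (12, 2)  # long queue on the north approach
state_b = (2, 12)  # long queue on the east approach

# The two true states call for different actions ...
print(optimal_phase(state_a), optimal_phase(state_b))
# ... but the agent receives the same observation for both.
print(coarse_observation(state_a) == coarse_observation(state_b))
```

Because both states map to the same observation, any policy defined on observations must choose the same phase in both, so it is necessarily sub-optimal in at least one of them.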
Finance experts note importance of workforce diversity, global collaboration
Against a backdrop of startling international developments, such as Brexit and the Hong Kong protests, Japan's financial sector is uniquely positioned to step out of the shadows of its competitors in Singapore and Hong Kong. This is the assessment of The Organization of Global Financial City Tokyo -- also known as FinCity.Tokyo -- which, on March 19, held its FinCity Global Forum at the Grand Hyatt Tokyo in Roppongi to explore the opportunities and challenges that await Japan in its pursuit to become a top global financial hub. Established in April 2019, FinCity.Tokyo is an organization that promotes Tokyo as a global financial hub and supports foreign financial services firms set up in Tokyo. In addition to the keynote and other speeches, the forum consisted of a series of panel discussions that invited industry veterans to discuss a wide array of topics, ranging from regional revitalization and socially oriented asset management to competition and collaboration among international financial cities. The first panel, centered on the theme of "Advancement of the Asset Management Industry and Global Financial City Initiative," invited panelists Yasumasa Tahara, director of the strategy development division at the Financial Services Agency; Kazuhide Toda, managing executive officer and chief investment officer at Nippon Life Insurance Company; and Oki Matsumoto, chairman and CEO at Monex Group Inc., to share their thoughts on how the industry can improve its asset management environment.
Networked Multi-Agent Reinforcement Learning with Emergent Communication
Gupta, Shubham, Hazra, Rishi, Dukkipati, Ambedkar
Multi-Agent Reinforcement Learning (MARL) methods find optimal policies for agents that operate in the presence of other learning agents. Central to achieving this is how the agents coordinate. One way to coordinate is by learning to communicate with each other. Can the agents develop a language while learning to perform a common task? In this paper, we formulate and study a MARL problem where cooperative agents are connected to each other via a fixed underlying network. These agents can communicate along the edges of this network by exchanging discrete symbols. However, the semantics of these symbols are not predefined and, during training, the agents are required to develop a language that helps them in accomplishing their goals. We propose a method for training these agents using emergent communication. We demonstrate the applicability of the proposed framework by applying it to the problem of managing traffic controllers, where we achieve state-of-the-art performance as compared to a number of strong baselines. More importantly, we perform a detailed analysis of the emergent communication to show, for instance, that the developed language is grounded and demonstrate its relationship with the underlying network topology. To the best of our knowledge, this is the only work that performs an in-depth analysis of emergent communication in a networked MARL setting while being applicable to a broad class of problems.
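The communication setting described above (a fixed network, discrete symbols exchanged along its edges) might be sketched as a single message-passing step. This is only an illustration under stated assumptions: the adjacency-dictionary representation, the function name `exchange_symbols`, and the integer symbols are hypothetical, and the sketch does not cover how agents learn which symbols to emit.

```python
def exchange_symbols(neighbors, emitted):
    """Deliver each agent's discrete symbol to its network neighbors.

    neighbors: dict mapping agent id -> list of adjacent agent ids
               (assumed fixed and symmetric).
    emitted:   dict mapping agent id -> the symbol it sends this step.
    Returns a dict mapping agent id -> {neighbor id: received symbol}.
    """
    return {
        agent: {nb: emitted[nb] for nb in nbrs}
        for agent, nbrs in neighbors.items()
    }


# Three agents on a line: 0 - 1 - 2
network = {0: [1], 1: [0, 2], 2: [1]}
symbols = {0: 3, 1: 0, 2: 1}  # symbols carry no predefined meaning

inbox = exchange_symbols(network, symbols)
print(inbox[1])  # agent 1 hears from both of its neighbors
```

Agents 0 and 2 never hear each other directly in this topology, which is one way the network structure can shape whatever language emerges.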
A Norm Emergence Framework for Normative MAS -- Position Paper
Morris-Martin, Andreasa, De Vos, Marina, Padget, Julian
Norm emergence is typically studied in the context of multiagent systems (MAS) where norms are implicit, and participating agents use simplistic decision-making mechanisms. These implicit norms are usually unconsciously shared and adopted through agent interaction. A norm is deemed to have emerged when a threshold or predetermined percentage of agents follow the "norm". Conversely, in normative MAS, norms are typically explicit and agents deliberately share norms through communication or are informed about norms by an authority, following which an agent decides whether to adopt the norm or not. The decision to adopt a norm by the agent can happen immediately after recognition or when an applicable situation arises. In this paper, we make the case that, similarly, a norm has emerged in a normative MAS when a percentage of agents adopt the norm. Furthermore, we posit that agents themselves can and should be involved in norm synthesis, and hence influence the norms governing the MAS, in line with Ostrom's eight principles. Consequently, we put forward a framework for the emergence of norms within a normative MAS, that allows participating agents to propose/request changes to the normative system, while special-purpose synthesizer agents formulate new norms or revisions in response to these requests. Synthesizers must collectively agree that the new norm or norm revision should proceed, and then finally be approved by an "Oracle". The normative system is then modified to incorporate the norm.
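The emergence criterion used above (a norm has emerged once a predetermined percentage of agents follow or adopt it) can be written as a short check. This is a minimal sketch: the function name `norm_has_emerged` and the default threshold of 0.9 are assumptions for illustration, since the text does not fix a particular percentage.

```python
def norm_has_emerged(adopted, threshold=0.9):
    """Return True when the share of adopting agents reaches the threshold.

    adopted:   list of booleans, one per agent (True = agent adopted the norm).
    threshold: assumed emergence percentage as a fraction in [0, 1].
    """
    if not adopted:
        return False  # no agents, so no norm can have emerged
    return sum(adopted) / len(adopted) >= threshold


population = [True, True, True, False]  # 3 of 4 agents adopted the norm
print(norm_has_emerged(population, threshold=0.75))
print(norm_has_emerged(population))  # default 0.9 threshold is not met
```

The same check applies whether adoption is implicit (observed behaviour) or explicit (a deliberate decision in a normative MAS); only the way the `adopted` flags are obtained differs.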
A Deep Ensemble Multi-Agent Reinforcement Learning Approach for Air Traffic Control
Ghosh, Supriyo, Laguna, Sean, Lim, Shiau Hong, Wynter, Laura, Poonawala, Hasan
Air traffic control is an example of a highly challenging operational problem that is readily amenable to human expertise augmentation via decision support technologies. In this paper, we propose a new intelligent decision making framework that leverages multi-agent reinforcement learning (MARL) to dynamically suggest adjustments of aircraft speeds in real-time. The goal of the system is to enhance the ability of an air traffic controller to provide effective guidance to aircraft to avoid air traffic congestion, near-miss situations, and to improve arrival timeliness. We develop a novel deep ensemble MARL method that can concisely capture the complexity of the air traffic control problem by learning to efficiently arbitrate between the decisions of a local kernel-based RL model and a wider-reaching deep MARL model. The proposed method is trained and evaluated on an open-source air traffic management simulator developed by Eurocontrol. Extensive empirical results on a real-world dataset including thousands of aircraft demonstrate the feasibility of using multi-agent RL for the problem of en-route air traffic control and show that our proposed deep ensemble MARL method significantly outperforms three state-of-the-art benchmark approaches.