Agents
Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping
Bargiacchi, Eugenio, Verstraeten, Timothy, Roijers, Diederik M., Nowé, Ann
We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes. The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function. Our approach only requires knowledge about the structure of the problem in the form of a dynamic decision network. Using this information, our method learns a model of the environment and performs temporal difference updates which affect multiple joint states and actions at once. Batch updates are additionally performed which efficiently back-propagate knowledge throughout the factored Q-function. Our method outperforms the state-of-the-art sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and on randomized environments.
Inducing Cooperation in Multi-Agent Games Through Status-Quo Loss
Badjatiya, Pinkesh, Sarkar, Mausoom, Sinha, Abhishek, Singh, Siddharth, Puri, Nikaash, Krishnamurthy, Balaji
Social dilemma situations bring out the conflict between individual and group rationality. When individuals act rationally in such situations, the group suffers sub-optimal outcomes. The Iterative Prisoner's Dilemma (IPD) is a two-player game that offers a theoretical framework to model and study such social situations. In the Prisoner's Dilemma, individualistic behavior leads to mutual defection and sub-optimal outcomes. This result is in contrast to what one observes in human groups, where humans often sacrifice individualistic behavior for the good of the collective. It is interesting to study how and why such cooperative and individually irrational behavior emerges in human groups. To this end, recent work models this problem by treating each player as a Deep Reinforcement Learning (RL) agent and evolves cooperative behavioral policies through internal information or reward sharing mechanisms. We propose an approach to evolve cooperative behavior between RL agents playing the IPD game without sharing rewards, internal details (weights, gradients), or a communication channel. We introduce a Status-Quo loss (SQLoss) that incentivizes cooperative behavior by encouraging policy stationarity. We also describe an approach to transform a two-player game (with visual inputs) into its IPD formulation through self-supervised skill discovery (IPDistill). We show how our approach outperforms existing approaches in the Iterative Prisoner's Dilemma and the two-player Coin game.
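The claim that individually rational play leads to mutual defection can be checked on a single round of the game. The sketch below is only an illustration: the payoff values are the conventional textbook ones (assumed here, not taken from the paper), and the helper names `best_response` and `joint_payoff` are hypothetical.

```python
# Illustrative one-shot Prisoner's Dilemma.
# Payoff values are assumed textbook numbers satisfying T > R > P > S;
# they are not taken from the paper.
R, S, T, P = 3, 0, 5, 1  # reward, sucker, temptation, punishment
C, D = "C", "D"          # cooperate, defect

# PAYOFF[(my_action, other_action)] is the payoff to "me".
PAYOFF = {(C, C): R, (C, D): S, (D, C): T, (D, D): P}


def best_response(other_action):
    """Return the action that maximizes a player's own payoff
    against a fixed action of the other player."""
    return max((C, D), key=lambda a: PAYOFF[(a, other_action)])


def joint_payoff(a1, a2):
    """Sum of both players' payoffs for a joint action."""
    return PAYOFF[(a1, a2)] + PAYOFF[(a2, a1)]


if __name__ == "__main__":
    for other in (C, D):
        print(f"best response to {other}: {best_response(other)}")
    print("joint payoff (D, D):", joint_payoff(D, D))
    print("joint payoff (C, C):", joint_payoff(C, C))
```

With these assumed numbers, defecting is the best response to either action, yet mutual defection yields a lower joint payoff than mutual cooperation, which is the tension the abstract describes.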
Distributed Learning in the Non-Convex World: From Batch to Streaming Data, and Beyond
Chang, Tsung-Hui, Hong, Mingyi, Wai, Hoi-To, Zhang, Xinwei, Lu, Songtao
Distributed learning has become a critical enabler of the massively connected world envisioned by many. This article discusses four key elements of scalable distributed processing and real-time intelligence: problems, data, communication and computation. Our aim is to provide a fresh and unique perspective about how these elements should work together in an effective and coherent manner. In particular, we provide a selective review about the recent techniques developed for optimizing non-convex models (i.e., problem classes), processing batch and streaming data (i.e., data types), over the networks in a distributed manner (i.e., communication and computation paradigm). We describe the intuitions and connections behind a core set of popular distributed algorithms, emphasizing how to trade off between computation and communication costs. Practical issues and future research directions will also be discussed. We are living in a highly connected world, and it will become exponentially more connected in a decade. These devices collect a huge amount of real-time data, perform complex computational tasks, and provide vital services which significantly improve our lives and enrich our collective productivity. THC, MH, HTW are ordered alphabetically, and contributed equally. MH is the corresponding author. THC is with the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China. MH and XZ are with the ECE Department, University of Minnesota, MN, USA. HTW is with the Department of SEEM, The Chinese University of Hong Kong, Hong Kong SAR, China. SL is with IBM Research AI, IBM Thomas J. Watson Research Center, Yorktown Heights, New York 10598, USA.
Epistemic Graphs for Representing and Reasoning with Positive and Negative Influences of Arguments
Hunter, Anthony, Polberg, Sylwia, Thimm, Matthias
This paper introduces epistemic graphs as a generalization of the epistemic approach to probabilistic argumentation. In these graphs, an argument can be believed or disbelieved up to a given degree, thus providing a more fine-grained alternative to the standard Dung's approaches when it comes to determining the status of a given argument. Furthermore, the flexibility of the epistemic approach allows us to both model the rationale behind the existing semantics as well as completely deviate from them when required. Epistemic graphs can model both attack and support as well as relations that are neither support nor attack. The way other arguments influence a given argument is expressed by the epistemic constraints that can restrict the belief we have in an argument with a varying degree of specificity. The fact that we can specify the rules under which arguments should be evaluated and we can include constraints between unrelated arguments permits the framework to be more context-sensitive. It also allows for better modelling of imperfect agents, which can be important in multi-agent applications.
A Unified Conversational Assistant Framework for Business Process Automation
Business process automation is a booming multi-billion-dollar industry that promises to remove menial tasks from workers' plates – through the introduction of autonomous agents – and free up their time and brain power for more creative and engaging tasks. However, an essential component to the successful deployment of such autonomous agents is the ability of business users to monitor their performance and customize their execution. A simple and user-friendly interface with a low learning curve is necessary to increase the adoption of such agents in banking, insurance, retail and other domains. As a result, proactive chatbots will play a crucial role in the business automation space. Not only can they respond to users' queries and perform actions on their behalf but also initiate communication with the users to inform them of the system's behavior.
Perspectives and Ethics of the Autonomous Artificial Thinking Systems
Assessing the feasibility of autonomous artificial thinking systems requires comparing the way human beings acquire information and develop thought with the current capacities of autonomous information systems. Our model uses four hierarchies: the hierarchy of information systems, the cognitive hierarchy, the linguistic hierarchy and the digital informative hierarchy that combines artificial intelligence, the power of computer models, methods and tools to develop autonomous information systems. The question of whether an autonomous system can provide a form of artificial thought arises together with its ethical consequences for social life and the perspective of transhumanism.
Exploiting Language Instructions for Interpretable and Compositional Reinforcement Learning
van der Meer, Michiel, Pirotta, Matteo, Bruni, Elia
In this work, we present an alternative approach to making an agent compositional through the use of a diagnostic classifier. Because of the need for explainable agents in automated decision processes, we attempt to interpret the latent space from an RL agent to identify its current objective in a complex language instruction. Results show that the classification process causes changes in the hidden states which makes them more easily interpretable, but also causes a shift in zero-shot performance to novel instructions. Lastly, we limit the supervisory signal on the classification, and observe a similar but less notable effect.
Decentralized Optimization of Vehicle Route Planning -- A Cross-City Comparative Study
Davis, Brionna, Jennings, Grace, Pothast, Taylor, Gerostathopoulos, Ilias, Pournaras, Evangelos, Stern, Raphael E.
New mobility concepts are at the forefront of research and innovation in smart cities. The introduction of connected and autonomous vehicles enables new possibilities in vehicle routing. Specifically, knowing the origin and destination of each agent in the network can allow for real-time routing of the vehicles to optimize network performance. However, this relies on individual vehicles being "altruistic", i.e., being willing to accept an alternative non-preferred route in order to achieve a network-level performance goal. In this work, we conduct a study to compare different levels of agent altruism and the resulting effect on the network-level traffic performance. Specifically, this study compares the effects of different underlying urban structures on the overall network performance, and investigates which characteristics of the network make it possible to realize routing improvements using a decentralized optimization router. The main finding is that, with increased vehicle altruism, it is possible to balance traffic flow among the links of the network. We show evidence that the decentralized optimization router is more effective on networks under high load. We also study the influence of city characteristics; in particular, networks with a higher number of nodes (intersections) or edges (roads) per unit area allow for more possible alternate routes, and thus have higher potential to improve network performance.
How virtual agents transform the customer experience - Dynamics 365 Blog
We often hear the phrases customer experience and customer engagement used interchangeably. But these terms have completely separate meanings. Customer experience is a single event with the customer: a service issue, a promotion, a survey. Customer engagement is a collection of customer experiences that impact engagement such as loyalty, advocacy, and so on. Both customer experience and customer engagement are significant to a company's overall success.
AI can predict your future behaviour with powerful new simulations
The US presidential election campaign is in its final days. Donald Trump is behind in the polls and the pundits are predicting a win for his Democrat challenger, former vice president Joe Biden. Trump boasts that he will win again. With two weeks to go, his campaign unleashes an offensive in the crucial swing states: adverts, Facebook posts, WhatsApp groups and tweets. They warn of violent crime and civil unrest driven by immigrants and gangs, playing up Trump's endorsement by evangelicals and smearing Biden as a closet atheist. The initiative works and Trump snatches another unlikely victory.