Agents
Dynamic Games among Teams with Delayed Intra-Team Information Sharing
Tang, Dengwang, Tavafoghi, Hamidreza, Subramanian, Vijay, Nayyar, Ashutosh, Teneketzis, Demosthenis
We analyze a class of stochastic dynamic games among teams with asymmetric information, where members of a team share their observations internally with a delay of $d$. Each team is associated with a controlled Markov Chain, whose dynamics are coupled through the players' actions. These games exhibit challenges in both theory and practice due to the presence of signaling and the increasing domain of information over time. We develop a general approach to characterize a subset of Nash Equilibria where the agents can use a compressed version of their information, instead of the full information, to choose their actions. We identify two subclasses of strategies: Sufficient Private Information Based (SPIB) strategies, which only compress private information, and Compressed Information Based (CIB) strategies, which compress both common and private information. We show that while SPIB-strategy-based equilibria always exist, the same is not true for CIB-strategy-based equilibria. We develop a backward inductive sequential procedure, whose solution (if it exists) provides a CIB strategy-based equilibrium. We identify some instances where we can guarantee the existence of a solution to the above procedure. Our results highlight the tension among compression of information, existence of (compression based) equilibria, and backward inductive sequential computation of such equilibria in stochastic dynamic games with asymmetric information.
Inferring urban social networks from publicly available data
Guarino, Stefano, Mastrostefano, Enrico, Bernaschi, Massimo, Celestini, Alessandro, Cianfriglia, Marco, Torre, Davide, Zastrow, Lena
Defining accurate models for real-world social networks is instrumental in several research fields, e.g., in sociology [1], epidemiology [2] or marketing [3]. In combination with computer simulations these models may represent a valuable tool to understand social phenomena, along with classic analytical studies. Dynamic processes, such as the spread of a disease or a rumour, can be represented upon suitable networks that encode the patterns of connection and interaction among the individuals of a population. Moreover, the comparison of synthetic networks produced by different generative models helps to infer how each factor contributes to the emergence of experimentally measured properties of real networks [4]. In this paper, we present a novel computational model for urban social networks that combines a data-driven framework with a set of adjustable parameters. A fully operational open source implementation of the model is available under the GPL v3 at gitlab.com/cranic-group/usn. The software can generate a synthetic social network of "strong ties" [5] among geo-referenced and age-stratified individuals. The graph encodes information on the urban social fabric and, as such, it increases the plausibility of dynamic (e.g., transmission) processes that may be influenced by preferences and actions of agents and groups of related agents. On the one hand, our social graph may be used to simulate the fact that friends and relatives may go out together, organize public or private meetings, and are, in general, more likely to interact.
An active inference model of collective intelligence
Kaufmann, Rafael, Gupta, Pranav, Taylor, Jacob
To date, formal models of collective intelligence have lacked a plausible mathematical description of the relationship between local-scale interactions between highly autonomous sub-system components (individuals) and global-scale behavior of the composite system (the collective). In this paper we use the Active Inference Formulation (AIF), a framework for explaining the behavior of any non-equilibrium steady state system at any scale, to posit a minimal agent-based model that simulates the relationship between local individual-level interaction and collective intelligence (operationalized as system-level performance). We explore the effects of providing baseline AIF agents (Model 1) with specific cognitive capabilities: Theory of Mind (Model 2); Goal Alignment (Model 3), and Theory of Mind with Goal Alignment (Model 4). These stepwise transitions in sophistication of cognitive ability are motivated by the types of advancements plausibly required for an AIF agent to persist and flourish in an environment populated by other AIF agents, and have also recently been shown to map naturally to canonical steps in human cognitive ability. Illustrative results show that stepwise cognitive transitions increase system performance by providing complementary mechanisms for alignment between agents' local and global optima. Alignment emerges endogenously from the dynamics of interacting AIF agents themselves, rather than being imposed exogenously by incentives to agents' behaviors (contra existing computational models of collective intelligence) or top-down priors for collective behavior (contra existing multiscale simulations of AIF). These results shed light on the types of generic information-theoretic patterns conducive to collective intelligence in human and other complex adaptive systems.
Cresta, which uses AI to mentor customer service agents in real time, raises $50M
Cresta, an AI-powered platform that offers real-time support to help customer service agents respond to inquiries on calls or in chats, has raised $50 million in a series B round of funding. The company's latest investment, which was led by Sequoia Capital, with participation from Greylock Partners, Andreessen Horowitz, Allen & Company, and Porsche Ventures, comes after a year of growth that saw its revenues quadruple. It's difficult to read too much into any first-year revenue growth metrics, but it's clear that companies are hankering for technology that helps them optimize their customer-facing operations. Contact centers have proven fertile ground for AI, with a slew of companies emerging to offer their own take on how automation can improve companies' interactions with their customers. Just today, Uniphore announced a fresh $140 million investment to analyze emotion and engagement in both voice and video-based calls, while Talkdesk launched a new "human-in-the-loop" AI trainer for contact centers.
Deep Reinforcement Learning for Constrained Field Development Optimization in Subsurface Two-phase Flow
Nasir, Yusuf, He, Jincong, Hu, Chaoshun, Tanaka, Shusei, Wang, Kainan, Wen, XianHuan
We present a deep reinforcement learning-based artificial intelligence agent that could provide optimized development plans given a basic description of the reservoir and rock/fluid properties with minimal computational cost. This artificial intelligence agent, comprising a convolutional neural network, provides a mapping from a given state of the reservoir model, constraints, and economic condition to the optimal decision (drill/do not drill and well location) to be taken in the next stage of the defined sequential field development planning process. The state of the reservoir model is defined using parameters that appear in the governing equations of the two-phase flow. A feedback loop training process referred to as deep reinforcement learning is used to train an artificial intelligence agent with such a capability. The training entails millions of flow simulations with varying reservoir model descriptions (structural, rock and fluid properties), operational constraints, and economic conditions. The parameters that define the reservoir model, operational constraints, and economic conditions are randomly sampled from a defined range of applicability. Several algorithmic treatments are introduced to enhance the training of the artificial intelligence agent. After appropriate training, the artificial intelligence agent provides an optimized field development plan instantly for new scenarios within the defined range of applicability. This approach has advantages over traditional optimization algorithms (e.g., particle swarm optimization, genetic algorithm) that are generally used to find a solution for a specific field development scenario and are typically not generalizable to different scenarios.
Solving Heterogeneous General Equilibrium Economic Models with Deep Reinforcement Learning
Hill, Edward, Bardoscia, Marco, Turrell, Arthur
General equilibrium macroeconomic models are a core tool used by policymakers to understand a nation's economy. They represent the economy as a collection of forward-looking actors whose behaviours combine, possibly with stochastic effects, to determine global variables (such as prices) in a dynamic equilibrium. However, standard semi-analytical techniques for solving these models make it difficult to include the important effects of heterogeneous economic actors. The COVID-19 pandemic has further highlighted the importance of heterogeneity, for example in age and sector of employment, in macroeconomic outcomes and the need for models that can more easily incorporate it. We use techniques from reinforcement learning to solve such models incorporating heterogeneous agents in a way that is simple, extensible, and computationally efficient. We demonstrate the method's accuracy and stability on a toy problem for which there is a known analytical solution, its versatility by solving a general equilibrium problem that includes global stochasticity, and its flexibility by solving a combined macroeconomic and epidemiological model to explore the economic and health implications of a pandemic. The latter successfully captures plausible economic behaviours induced by differential health risks by age.
Autonomous systems and the 2nd industrial revolution.
We've all seen the super cool demonstrations of Boston Dynamics' two-legged Atlas robot doing backflips, and been properly amazed! Today, an article about a new robot from Boston Dynamics caught my eye. Stretch is designed for logistics applications. It lifts and moves boxes in warehouses. Unlike traditional industrial robots, it isn't fixed in place, but is mobile instead.
Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments
Han, Muzhi, Zhang, Zeyu, Jiao, Ziyuan, Xie, Xu, Zhu, Yixin, Zhu, Song-Chun, Liu, Hangxin
In this paper, we rethink the problem of scene reconstruction from an embodied agent's perspective: While the classic view focuses on the reconstruction accuracy, our new perspective emphasizes the underlying functions and constraints such that the reconstructed scenes provide actionable information for simulating interactions with agents. Here, we address this challenging problem by reconstructing an interactive scene using an RGB-D data stream, which captures (i) the semantics and geometry of objects and layouts by a 3D volumetric panoptic mapping module, and (ii) object affordance and contextual relations by reasoning over physical common sense among objects, organized by a graph-based scene representation. Crucially, this reconstructed scene replaces the object meshes in the dense panoptic map with part-based articulated CAD models for finer-grained robot interactions. In the experiments, we demonstrate that (i) our panoptic mapping module outperforms previous state-of-the-art methods, (ii) a high-performance physical reasoning procedure matches, aligns, and replaces objects' meshes with best-fitted CAD models, and (iii) reconstructed scenes are physically plausible and naturally afford actionable interactions; without any manual labeling, they are seamlessly imported to ROS-based simulators and virtual environments for complex robot task executions.
Flatland Competition 2020: MAPF and MARL for Efficient Train Coordination on a Grid World
Laurent, Florian, Schneider, Manuel, Scheller, Christian, Watson, Jeremy, Li, Jiaoyang, Chen, Zhe, Zheng, Yi, Chan, Shao-Hung, Makhnev, Konstantin, Svidchenko, Oleg, Egorov, Vladimir, Ivanov, Dmitry, Shpilman, Aleksei, Spirovska, Evgenija, Tanevski, Oliver, Nikov, Aleksandar, Grunder, Ramon, Galevski, David, Mitrovski, Jakov, Sartoretti, Guillaume, Luo, Zhiyao, Damani, Mehul, Bhattacharya, Nilabha, Agarwal, Shivam, Egli, Adrian, Nygren, Erik, Mohanty, Sharada
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur, for example the breakdown of a vehicle. While solving the VRSP in various settings has been an active area in operations research (OR) for decades, the ever-growing complexity of modern railway networks makes dynamic real-time scheduling of traffic virtually impossible. Recently, multi-agent reinforcement learning (MARL) has successfully tackled challenging tasks where many agents need to be coordinated, such as multiplayer video games. However, the coordination of hundreds of agents in a real-life setting like a railway network remains challenging and the Flatland environment used for the competition models these real-world properties in a simplified manner. Submissions had to bring as many trains (agents) to their target stations in as little time as possible. While the best submissions were in the OR category, participants found many promising MARL approaches. Using both centralized and decentralized learning based approaches, top submissions used graph representations of the environment to construct tree-based observations. Further, different coordination mechanisms were implemented, such as communication and prioritization between agents. This paper presents the competition setup, four outstanding solutions to the competition, and a cross-comparison between them.
User profile-driven large-scale multi-agent learning from demonstration in federated human-robot collaborative environments
Papadopoulos, Georgios Th., Leonidis, Asterios, Antona, Margherita, Stephanidis, Constantine
Learning from Demonstration (LfD) has been established as the dominant paradigm for efficiently transferring skills from human teachers to robots. In this context, the Federated Learning (FL) conceptualization has very recently been introduced for developing large-scale human-robot collaborative environments, aiming to robustly address, among others, the critical challenges of multi-agent learning and long-term autonomy. In the current work, the latter scheme is further extended and enhanced, by designing and integrating a novel user profile formulation for providing a fine-grained representation of the exhibited human behavior, adopting a Deep Learning (DL)-based formalism. In particular, a hierarchically organized set of key information sources is considered, including: a) User attributes (e.g. demographic, anthropomorphic, educational, etc.), b) User state (e.g. fatigue detection, stress detection, emotion recognition, etc.) and c) Psychophysiological measurement data (e.g. gaze, electrodermal activity, heart rate, etc.). Then, a combination of Long Short-Term Memory (LSTM) and stacked autoencoders, with appropriately defined neural network architectures, is employed for the modelling step. The overall designed scheme enables both short- and long-term analysis/interpretation of the human behavior (as observed during the feedback capturing sessions), so as to adaptively adjust the importance of the collected feedback samples when aggregating information originating from the same and different human teachers, respectively.