Agents
Multi-Agent Advisor Q-Learning
Ganapathi Subramanian, Sriram (U Waterloo) | Taylor, Matthew E. (University of Alberta) | Larson, Kate (University of Waterloo) | Crowley, Mark (University of Waterloo)
In the last decade, there have been significant advances in multi-agent reinforcement learning (MARL) but there are still numerous challenges, such as high sample complexity and slow convergence to stable policies, that need to be overcome before widespread deployment is possible. However, many real-world environments already, in practice, deploy sub-optimal or heuristic approaches for generating policies. An interesting question that arises is how to best use such approaches as advisors to help improve reinforcement learning in multi-agent domains. In this paper, we provide a principled framework for incorporating action recommendations from online suboptimal advisors in multi-agent settings. We describe the problem of ADvising Multiple Intelligent Reinforcement Agents (ADMIRAL) in nonrestrictive general-sum stochastic game environments and present two novel Q-learning based algorithms: ADMIRAL - Decision Making (ADMIRAL-DM) and ADMIRAL - Advisor Evaluation (ADMIRAL-AE), which allow us to improve learning by appropriately incorporating advice from an advisor (ADMIRAL-DM), and evaluate the effectiveness of an advisor (ADMIRAL-AE). We analyze the algorithms theoretically and provide fixed point guarantees regarding their learning in general-sum stochastic games. Furthermore, extensive experiments illustrate that these algorithms can be used in a variety of environments, perform favourably compared to other related baselines, can scale to large state-action spaces, and are robust to poor advice from advisors.
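As a rough illustration of learning from a sub-optimal advisor, the sketch below runs tabular Q-learning on a toy chain in which the learner defers to a heuristic advisor with a probability that decays over training. This is not ADMIRAL-DM or ADMIRAL-AE: it is a simplified single-agent example, and the environment, the `advisor` heuristic, the decay schedule, and all parameter values are assumptions made here for illustration only.

```python
import random

N_STATES = 6          # chain of states 0..5; reaching state 5 ends the episode
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Deterministic chain: reward 1.0 only on reaching the right end."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def advisor(state, rng):
    """Hypothetical sub-optimal heuristic: recommends 'right' 80% of the time."""
    return +1 if rng.random() < 0.8 else -1


def train(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for ep in range(episodes):
        # Assumed schedule: trust in the advisor decays linearly to zero
        # over the first half of training.
        p_advice = max(0.0, 1.0 - ep / (0.5 * episodes))
        state, done = 0, False
        while not done:
            if rng.random() < p_advice:
                action = advisor(state, rng)          # follow the advice
            elif rng.random() < 0.1:
                action = rng.choice(ACTIONS)          # occasional exploration
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Standard Q-learning update toward the bootstrapped target.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the learned action for each non-terminal state. Because the update is ordinary off-policy Q-learning, the advisor's occasional wrong recommendations influence which states are visited but not the targets being learned.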
Estimation of Standard Auction Models
Cherapanamjeri, Yeshwanth, Daskalakis, Constantinos, Ilyas, Andrew, Zampetakis, Manolis
Estimating value and/or bid distributions from an observed sequence of auctions is a fundamental challenge in Econometrics with direct practical applications. For example, these fundamentals allow one to analyze the performance of an auction and make counterfactual predictions about alternatives. The difficulty of this problem depends on the format of the auctions and the structure of the observed information from each one, as well as how the fundamentals of bidders are interrelated and vary across the sequence of observations. In this paper, we study a basic version of the afore-described estimation challenge, wherein the auction format and the bidder distributions stay fixed across observations, and the bidders have independent private values (which are independently resampled across different observations). The auction formats that we consider are first- and second-price auctions, as well as Dutch and English auctions. What will make our problem challenging is that (i) our bidders are ex ante asymmetric, drawing their independent private values from different distributions; (ii) we will make no parametric assumptions about these distributions; and (iii) we will only be observing the identity of the winner and the price they paid but not the losing bids. Under this observational model and our independent private values assumption above, we can focus our attention on first- and second-price auctions, and our results automatically extend to Dutch and English auctions. In the above settings, we give computationally and sample efficient methods for estimating all agents' bid distributions and (under equilibrium assumptions) value distributions: In the case of first-price auctions, we provide finite-sample estimation guarantees under Lévy, Kolmogorov and Total Variation distance with minimal assumptions.
Under (a condition weaker than) a lower bound on the density of the bid distributions (although we actually do not need existence of densities), Theorem 2.2 shows that the bid distributions can be estimated to within ε in Lévy distance, using 1/ ε
Optimal preference satisfaction for conflict-free joint decisions
Shinkawa, Hiroaki, Chauvet, Nicolas, Bachelier, Guillaume, Röhm, André, Horisaki, Ryoichi, Naruse, Makoto
We all have preferences when multiple choices are available. If we insist on satisfying our preferences only, we may suffer a loss due to conflicts with other people's identical selections. Such a case applies when the choice cannot be divided into multiple pieces due to the intrinsic nature of the resources. Former studies, such as the top trading cycle, examined how to conduct fair joint decision-making while avoiding decision conflicts from the perspective of game theory when multiple players have their own deterministic preference profiles. However, in reality, probabilistic preferences can naturally appear in relation to the stochastic decision-making of humans. Here, we theoretically derive conflict-free joint decision-making that can satisfy the probabilistic preferences of all individual players. More specifically, we mathematically prove the conditions wherein the deviation of the resultant chance of obtaining each choice from the individual preference profile, which we call the loss, becomes zero, meaning that all players' satisfaction is perfectly appreciated while avoiding decision conflicts. Furthermore, even in situations where zero-loss conflict-free joint decision-making is unachievable, we show how to derive joint decision-making that accomplishes the theoretical minimum loss while ensuring conflict-free choices. Numerical demonstrations are also shown with several benchmarks.
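To make the notion of loss concrete, the sketch below evaluates a two-player joint decision given as a matrix of joint selection probabilities. A zero diagonal encodes conflict-free choices, and the loss compares each player's marginal chance of obtaining an option with that player's preference. The squared-deviation form of the loss, the two-player restriction, and the function name `marginal_loss` are assumptions made here for illustration; the paper's exact definition may differ.

```python
def marginal_loss(joint, pref_a, pref_b, tol=1e-9):
    """Sum of squared deviations between marginals and preference profiles.

    joint[i][j] is the probability that player A gets option i while
    player B gets option j.
    """
    n = len(joint)
    if any(abs(joint[i][i]) > tol for i in range(n)):
        raise ValueError("not conflict-free: diagonal entries must be zero")
    if abs(sum(sum(row) for row in joint) - 1.0) > tol:
        raise ValueError("joint probabilities must sum to one")
    # Player A's chance of obtaining option i is the i-th row sum.
    row = [sum(joint[i]) for i in range(n)]
    # Player B's chance of obtaining option j is the j-th column sum.
    col = [sum(joint[i][j] for i in range(n)) for j in range(n)]
    loss_a = sum((r - p) ** 2 for r, p in zip(row, pref_a))
    loss_b = sum((c - p) ** 2 for c, p in zip(col, pref_b))
    return loss_a + loss_b


pref_a = [0.5, 0.3, 0.2]
pref_b = [0.2, 0.3, 0.5]

# A conflict-free joint decision whose marginals match both profiles.
zero_loss_joint = [
    [0.0, 0.2, 0.3],
    [0.1, 0.0, 0.2],
    [0.1, 0.1, 0.0],
]

# A conflict-free but rigid decision: A always gets option 0, B option 1.
rigid_joint = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
```

With these inputs, `marginal_loss(zero_loss_joint, pref_a, pref_b)` is zero up to floating-point rounding, while the rigid assignment avoids conflicts but leaves both players far from their preferences.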
Features of a smart city
A smart city is a city that uses technology to provide services and solve city problems. The main goals of a smart city are to improve policy efficiency, reduce waste and inconvenience, improve social and economic quality, and maximize social inclusion. Due to the breadth of technologies that have been implemented under the smart city label, it is difficult to distill a precise definition of a smart city. As the world's population continues to urbanize – by 2050, 66% of the world's population is expected to be urban – there is a global trend toward the creation of smart cities. This tendency not only causes many physical, social, behavioural, economic, and infrastructure issues, but it also creates many opportunities.
Yellow.ai launches low-code digital agents for swift deployment
Yellow.ai, which offers automation across customer engagement, support and conversational commerce for enterprises, has announced the availability of pre-built Dynamic AI Agents for rapid deployment across a number of verticals. The agents are designed to connect conversations across voice, text and chat, in multiple languages. The agents, which will be available in Yellow.ai's Agents are also available to enhance employee experience by automating HR processes like onboarding and training, and IT management services. We seem to stand on the brink of a working world in which everything is automated, both for employees and customers.
Learning Anisotropic Interaction Rules from Individual Trajectories in a Heterogeneous Cellular Population
Messenger, Daniel A., Wheeler, Graycen E., Liu, Xuedong, Bortz, David M.
Interacting particle system (IPS) models have proven to be highly successful for describing the spatial movement of organisms. However, it has proven challenging to infer the interaction rules directly from data. In the field of equation discovery, the Weak form Sparse Identification of Nonlinear Dynamics (WSINDy) methodology has been shown to be very computationally efficient for identifying the governing equations of complex systems, even in the presence of substantial noise. Motivated by the success of IPS models in describing the spatial movement of organisms, we develop WSINDy for second order IPSs to model the movement of communities of cells. Specifically, our approach learns the directional interaction rules that govern the dynamics of a heterogeneous population of migrating cells. Rather than aggregating cellular trajectory data into a single best-fit model, we learn the models for each individual cell. These models can then be efficiently classified according to the active classes of interactions present in the model. From these classifications, aggregated models are constructed hierarchically to simultaneously identify different species of cells present in the population and determine best-fit models for each species. We demonstrate the efficiency and proficiency of the method on several test scenarios, motivated by common cell migration experiments.
GitHub - google-research/recsim_ng: RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems
RecSim NG is a scalable, modular, differentiable simulator implemented in Edward2 and TensorFlow. It offers: a powerful, general probabilistic programming language for agent-behavior specification; an XLA-based vectorized execution model for running simulations on accelerated hardware; and tools for probabilistic inference and latent-variable model learning, backed by automatic differentiation and tracing. We describe RecSim NG and illustrate how it can be used to create transparent, configurable, end-to-end models of a recommender ecosystem. Specifically, we present a collection of use cases that demonstrate how the functionality described above can help both researchers and practitioners easily develop and train novel algorithms for recommender systems. Please cite the paper if you use the code from this repository in your work. This is not an officially supported Google product.
Embracing AWKWARD! Real-time Adjustment of Reactive Planning Using Social Norms
This paper presents the AWKWARD agent architecture for the development of agents in Multi-Agent Systems. AWKWARD agents can have their plans re-configured in real time to align with social role requirements under changing environmental and social circumstances. The proposed hybrid architecture makes use of Behaviour Oriented Design (BOD) to develop agents with reactive planning and of the well-established OperA framework to provide organisational, social, and interaction definitions in order to validate and adjust agents' behaviours. Together, OperA and BOD can achieve real-time adjustment of agent plans for evolving social roles, while providing the additional benefit of transparency into the interactions that drive this behavioural change in individual agents. We present this architecture to motivate the bridging between traditional symbolic- and behaviour-based AI communities, where such combined solutions can help MAS researchers in their pursuit of building stronger, more robust intelligent agent teams.
On automatic calibration of the SIRD epidemiological model for COVID-19 data in Poland
Błaszczyk, Piotr, Klimczak, Konrad, Mahdi, Adam, Oprocha, Piotr, Potorski, Paweł, Przybyłowicz, Paweł, Sobieraj, Michał
We propose a novel methodology for estimating the epidemiological parameters of a modified SIRD model (acronym of Susceptible, Infected, Recovered and Deceased individuals) and perform a short-term forecast of SARS-CoV-2 virus spread. We mainly focus on forecasting the number of deceased. The procedure was tested on reported data for Poland. For some short time intervals we performed numerical tests investigating the stability of parameter estimates in the proposed approach. Numerical experiments confirm the effectiveness of short-term forecasts (up to 2 weeks) and stability of the method. To improve their performance (i.e.
Cooperative Manipulation via Internal Force Regulation: A Rigidity Theory Perspective
Verginis, Christos K., Zelazo, Daniel, Dimarogonas, Dimos V.
This paper considers the integration of rigid cooperative manipulation with rigidity theory. Motivated by rigid models of cooperative manipulation systems, i.e., where the grasping contacts are rigid, we first introduce the notion of bearing and distance rigidity for graph frameworks in SE(3). Next, we associate the nodes of these frameworks with the robotic agents of rigid cooperative manipulation schemes and we express the object-agent interaction forces by using the graph rigidity matrix, which encodes the infinitesimal rigid body motions of the system. Moreover, we show that the associated cooperative manipulation grasp matrix is related to the rigidity matrix via a range-nullspace relation, based on which we provide novel results on the relation between the arising interaction and internal forces and consequently on the energy-optimal force distribution on a cooperative manipulation system. Finally, simulation results in a realistic environment support the validity of the theoretical findings.
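For readers unfamiliar with rigidity matrices, the sketch below builds the classical distance rigidity matrix of a small planar framework and checks that rigid body motions lie in its nullspace. It is an assumption-laden simplification: it uses point positions in the plane and distance constraints only, not the bearing-and-distance rigidity in SE(3) that the paper introduces, and the function names are chosen here for illustration.

```python
def rigidity_matrix(points, edges):
    """One row per edge (i, j): (p_i - p_j) in node i's block, (p_j - p_i) in node j's."""
    d = len(points[0])
    rows = []
    for i, j in edges:
        row = [0.0] * (d * len(points))
        for k in range(d):
            diff = points[i][k] - points[j][k]
            row[d * i + k] = diff
            row[d * j + k] = -diff
        rows.append(row)
    return rows


def edge_rates(matrix, velocities):
    """Matrix times stacked node velocities: half the rate of change of each squared edge length."""
    flat = [v for vel in velocities for v in vel]
    return [sum(r * v for r, v in zip(row, flat)) for row in matrix]


# A triangle in the plane with all three edges present.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
edges = [(0, 1), (1, 2), (0, 2)]
R = rigidity_matrix(points, edges)

# Rigid body motions: a common translation, and a rotation about the
# origin with velocity (-y, x) at each node.
translation = [(1.0, 2.0)] * 3
rotation = [(-y, x) for x, y in points]

# A non-rigid motion: only node 1 moves, which stretches two edges.
stretch = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
```

Here `edge_rates(R, translation)` and `edge_rates(R, rotation)` are all zeros, since rigid motions preserve every edge length to first order, whereas `edge_rates(R, stretch)` is nonzero on the two edges incident to node 1.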