 Agents


Mean-Field Control Approach to Decentralized Stochastic Control with Finite-Dimensional Memories

arXiv.org Artificial Intelligence

Decentralized stochastic control (DSC) considers the optimal control problem of a multi-agent system. However, DSC cannot be solved except in special cases because the estimation among the agents is generally intractable. In this work, we propose memory-limited DSC (ML-DSC), in which each agent compresses its observation history into a finite-dimensional memory. Because this compression simplifies the estimation among the agents, ML-DSC can be solved in more general cases based on mean-field control theory. We demonstrate ML-DSC in the general LQG problem. Because estimation and control are not clearly separated in the general LQG problem, the Riccati equation is modified to the decentralized Riccati equation, which improves estimation as well as control. Our numerical experiment shows that the decentralized Riccati equation is superior to the conventional Riccati equation.


Empirically grounded agent-based policy evaluation of the adoption of sustainable lighting under the European Ecodesign Directive

arXiv.org Artificial Intelligence

Twelve years ago, the European Union began the gradual phase-out of energy-inefficient incandescent light bulbs under the Ecodesign Directive. In this work, we implement an agent-based simulation to model consumer behaviour in the EU lighting market with the goal of explaining that behaviour and exploring alternative policies. Agents are based on the Consumat II model, have individual preferences based on empirical market research, gather experience from past actions, and socially interact with each other in a dynamic environment. Our findings suggest that the adoption of energy-friendly lighting alternatives was hindered by a low level of consumer interest combined with high-enough levels of satisfaction with incandescent bulbs, and that information campaigns can partially address this. These findings offer insight into both individual-level driving forces of behaviour and society-level outcomes in a niche market. With this, our work demonstrates the strengths of agent-based models for policy generation and evaluation.


Social-PatteRNN: Socially-Aware Trajectory Prediction Guided by Motion Patterns

arXiv.org Artificial Intelligence

As robots across domains start collaborating with humans in shared environments, algorithms that enable them to reason over human intent are important to achieve safe interplay. In our work, we study human intent through the problem of predicting trajectories in dynamic environments. We explore domains where navigation guidelines are relatively strictly defined but not clearly marked in their physical environments. We hypothesize that within these domains, agents tend to exhibit short-term motion patterns that reveal context information related to the agent's general direction, intermediate goals and rules of motion, e.g., social behavior. From this intuition, we propose Social-PatteRNN, an algorithm for recurrent, multi-modal trajectory prediction that exploits motion patterns to encode the aforesaid contexts. Our approach guides long-term trajectory prediction by learning to predict short-term motion patterns. It then extracts sub-goal information from the patterns and aggregates it as social context. We assess our approach across three domains: human crowds, humans in sports, and manned aircraft in terminal airspace, achieving state-of-the-art performance.


Resource Allocation to Agents with Restrictions: Maximizing Likelihood with Minimum Compromise

arXiv.org Artificial Intelligence

Many scenarios where agents with restrictions compete for resources can be cast as maximum matching problems on bipartite graphs. Our focus is on resource allocation problems where agents may have restrictions that make them incompatible with some resources. We assume that a Principal chooses a maximum matching randomly so that each agent is matched to a resource with some probability. Agents would like to improve their chances of being matched by modifying their restrictions within certain limits. The Principal's goal is to advise an unsatisfied agent to relax its restrictions so that the total cost of relaxation is within a budget (chosen by the agent) and the increase in the probability of being assigned a resource is maximized. We establish hardness results for some variants of this budget-constrained maximization problem and present algorithmic results for other variants. We experimentally evaluate our methods on synthetic datasets as well as on two novel real-world datasets: a vacation activities dataset and a classrooms dataset.
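
To make the matching formulation concrete, here is a minimal sketch of a maximum bipartite matching between agents and compatible resources, using the standard augmenting-path method (Kuhn's algorithm). It is not the paper's advice algorithm: it computes a single maximum matching rather than a probability over randomly chosen matchings, and the agent names, resource names, and the `maximum_matching` helper are hypothetical.

```python
from typing import Dict, List, Set


def maximum_matching(compat: Dict[str, List[str]]) -> Dict[str, str]:
    """Return one maximum matching (agent -> resource) via augmenting paths."""
    owner: Dict[str, str] = {}  # resource -> agent currently holding it

    def try_assign(agent: str, seen: Set[str]) -> bool:
        # Try each compatible resource; if it is taken, try to move its owner.
        for res in compat[agent]:
            if res in seen:
                continue
            seen.add(res)
            if res not in owner or try_assign(owner[res], seen):
                owner[res] = agent
                return True
        return False

    for agent in compat:
        try_assign(agent, set())
    return {agent: res for res, agent in owner.items()}


# Hypothetical instance: three agents compete for two resources.
compat = {"a1": ["r1"], "a2": ["r1", "r2"], "a3": ["r2"]}
matching = maximum_matching(compat)

# Relaxing a3's restriction (it now also accepts r3) lets every agent be matched.
relaxed = {"a1": ["r1"], "a2": ["r1", "r2"], "a3": ["r2", "r3"]}
relaxed_matching = maximum_matching(relaxed)
```

In this toy instance the relaxation raises the matching size from two to three, which is the kind of improvement an unsatisfied agent would be advised toward, subject to its budget.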


Efficient Customer Service Combining Human Operators and Virtual Agents

arXiv.org Artificial Intelligence

The prospect of combining human operators and virtual agents (bots) into an effective hybrid system that provides proper customer service to clients is promising yet challenging. The hybrid system decreases the customers' frustration when bots are unable to provide appropriate service and increases their satisfaction when they prefer to interact with human operators. Furthermore, we show that it is possible to decrease the cost and efforts of building and maintaining such virtual agents by enabling the virtual agent to incrementally learn from the human operators. We employ queuing theory to identify the key parameters that govern the behavior and efficiency of such hybrid systems and determine the main parameters that should be optimized in order to improve the service. We formally prove, and demonstrate in extensive simulations and in a user study, that with the proper choice of parameters, such hybrid systems are able to increase the number of served clients while simultaneously decreasing their expected waiting time and increasing satisfaction.


2-D Directed Formation Control Based on Bipolar Coordinates

arXiv.org Artificial Intelligence

This work proposes a novel 2-D formation control scheme for acyclic triangulated directed graphs (a class of minimally acyclic persistent graphs) based on bipolar coordinates with (almost) global convergence to the desired shape. Prescribed performance control is employed to devise a decentralized control law that avoids singularities and introduces robustness against external disturbances while ensuring predefined transient and steady-state performance for the closed-loop system. Furthermore, it is shown that the proposed formation control scheme can handle formation maneuvering, scaling, and orientation specifications simultaneously. Additionally, the proposed control law is implementable in agents' arbitrarily oriented local coordinate frames using only low-cost onboard vision sensors, which are favorable for practical applications. Finally, a formation maneuvering simulation study verifies the proposed approach.


Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

Multi-agent reinforcement learning (MARL) has found various applications in the fields of transportation and simulation [50, 1], stock price analysis and trading [32, 31], wireless communication networks [12, 11, 13], and learning behaviors in social dilemmas [33, 28, 34]. MARL, however, becomes intractable due to the complex interactions among agents as the number of agents increases. A recent tractable alternative is the mean-field approach, which considers MARL in the regime with a large number of homogeneous agents under weak interactions [20]. According to the number of agents and learning goals, there are three subtle types of mean-field theories for MARL. The first one is called mean-field MARL (MF-MARL), which refers to the empirical average of the states or actions of a finite population. For example, [52] proposes to approximate interactions within the population of agents by averaging the actions of the overall population or neighboring agents.
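
The averaging idea in the last sentence can be sketched as follows. This is only an illustration of an empirical mean over neighbors' actions, assuming one-hot action vectors and a hypothetical neighbor map; it is not taken from [52] or from the paper itself.

```python
from typing import Dict, List, Sequence


def mean_neighbor_action(
    actions: Dict[int, Sequence[float]],
    neighbors: Dict[int, List[int]],
    agent: int,
) -> List[float]:
    """Empirical average of the action vectors of an agent's neighbors."""
    nbrs = neighbors[agent]
    if not nbrs:
        raise ValueError("agent has no neighbors to average over")
    dim = len(actions[nbrs[0]])
    return [sum(actions[j][k] for j in nbrs) / len(nbrs) for k in range(dim)]


# Hypothetical example: three agents, two discrete actions encoded one-hot.
actions = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

mean_for_0 = mean_neighbor_action(actions, neighbors, 0)
mean_for_1 = mean_neighbor_action(actions, neighbors, 1)
```

Each agent can then condition on its own state and action plus this single averaged vector, instead of on the joint action of all other agents.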


Near-Optimal Distributed Linear-Quadratic Regulator for Networked Systems

arXiv.org Artificial Intelligence

This paper studies the trade-off between the degree of decentralization and the performance of a distributed controller in a linear-quadratic control setting. We study a system of interconnected agents over a graph and a distributed controller, called $\kappa$-distributed control, which lets the agents make control decisions based on the state information within distance $\kappa$ on the underlying graph. This controller can tune its degree of decentralization using the parameter $\kappa$ and thus allows a characterization of the relationship between decentralization and performance. We show that under mild assumptions, including stabilizability, detectability, and a subexponentially growing graph condition, the performance difference between $\kappa$-distributed control and centralized optimal control becomes exponentially small in $\kappa$. This result reveals that distributed control can achieve near-optimal performance with a moderate degree of decentralization, and thus it is an effective controller architecture for large-scale networked systems.
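
As an illustration of the information structure only, the sketch below computes which agents lie within graph distance $\kappa$ of a given agent using breadth-first search. The adjacency list and the `k_hop_neighborhood` helper are hypothetical, and the sketch does not compute any controller gains.

```python
from collections import deque
from typing import Dict, List, Set


def k_hop_neighborhood(adj: Dict[int, List[int]], source: int, kappa: int) -> Set[int]:
    """Return all nodes within graph distance kappa of source (source included)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == kappa:
            continue  # do not expand beyond distance kappa
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return set(dist)


# Hypothetical path graph 0 - 1 - 2 - 3 - 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

local_info = k_hop_neighborhood(adj, source=2, kappa=1)
wider_info = k_hop_neighborhood(adj, source=2, kappa=2)
```

With `kappa=0` an agent sees only its own state (fully decentralized), while a `kappa` at least as large as the graph diameter recovers the centralized information set.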


Prismal view of ethics

arXiv.org Artificial Intelligence

We shall have a hard look at ethics and try to extract insights in the form of abstract properties that might become tools. We want to connect ethics to games, talk about the performance of ethics, introduce curiosity into the interplay between competing and coordinating in well-performing ethics, and offer a view of possible developments that could unify increasing aggregates of entities. All this is under a long shadow cast by computational complexity that is quite negative about games. This analysis is the first step toward finding modeling aspects that might be used in AI ethics for integrating modern AI systems into human society.


Application of Machine Learning for Online Reputation Systems

arXiv.org Artificial Intelligence

Users on the internet usually require venues to provide better purchasing recommendations. This can be provided by a reputation system that processes ratings to produce recommendations. The rating aggregation process is a main part of a reputation system, producing a global opinion about product quality. Naive methods that are frequently used do not consider consumer profiles in their calculations and cannot discover unfair ratings or trends emerging in new ratings. Other, more sophisticated rating aggregation methods that use a weighted average technique focus on only one or a few aspects of consumers' profile data. This paper proposes a new reputation system that uses machine learning to predict the reliability of consumers from their profiles. In particular, we construct a new consumer profile dataset by extracting a set of factors that have a great impact on consumer reliability, which serve as input to machine learning algorithms. The predicted weight is then integrated with a weighted average method to compute the product reputation score. The proposed model has been evaluated over three MovieLens benchmarking datasets, using 10-fold cross-validation. Furthermore, the performance of the proposed model has been compared to previously published rating aggregation models. The obtained results were promising, which suggests that the proposed approach could be a potential solution for reputation systems. The results of the comparison demonstrated the accuracy of our models. Finally, the proposed approach can be integrated with online recommendation systems to provide better purchasing recommendations and facilitate the user experience in online shopping markets.