Agents
Inverse Graph Learning over Optimization Networks
Matta, Vincenzo, Santos, Augusto, Sayed, Ali H.
Many inferential and learning tasks can be accomplished efficiently by means of distributed optimization algorithms where the network topology plays a critical role in driving the local interactions among neighboring agents. There is a large body of literature examining the effect of the graph structure on the performance of optimization strategies. In this article, we examine the inverse problem and consider the reverse question: How much information does observing the behavior at the nodes convey about the underlying network structure used for optimization? Over large-scale networks, the difficulty of addressing such inverse questions (or problems) is compounded by the fact that usually only a limited portion of nodes can be probed, giving rise to a second important question: Despite the presence of several unobserved nodes, are partial and local observations still sufficient to discover the graph linking the probed nodes? The article surveys recent advances on this inverse learning problem and related questions. Examples of applications are provided to illustrate how the interplay between graph learning and distributed optimization arises in practice, e.g., in cognitive engineered systems such as distributed detection, or in other real-world problems such as the mechanism of opinion formation over social networks and the mechanism of coordination in biological networks. A unifying framework for examining the reconstruction error will be described, which allows to devise and examine various estimation strategies enabling successful graph learning. The relevance of specific network attributes, such as sparsity versus density of connections, and node degree concentration, is discussed in relation to the topology inference goal. It is shown how universal (i.e., data-driven) clustering algorithms can be exploited to solve the graph learning problem.
A Machine Learning Framework for Solving High-Dimensional Mean Field Game and Mean Field Control Problems
Ruthotto, Lars, Osher, Stanley, Li, Wuchen, Nurbekyan, Levon, Fung, Samy Wu
Mean field games (MFG) and mean field control (MFC) are critical classes of multi-agent models for efficient analysis of massive populations of interacting agents. Their areas of application span topics in economics, finance, game theory, industrial engineering, crowd motion, and more. In this paper, we provide a flexible machine learning framework for the numerical solution of potential MFG and MFC models. State-of-the-art numerical methods for solving such problems utilize spatial discretization that leads to a curse-of-dimensionality. We approximately solve high-dimensional problems by combining Lagrangian and Eulerian viewpoints and leveraging recent advances from machine learning. More precisely, we work with a Lagrangian formulation of the problem and enforce the underlying Hamilton-Jacobi-Bellman (HJB) equation that is derived from the Eulerian formulation. Finally, a tailored neural network parameterization of the MFG/MFC solution helps us avoid any spatial discretization. Our numerical results include the approximate solution of 100-dimensional instances of optimal transport and crowd motion problems on a standard work station. These results open the door to much-anticipated applications of MFG and MFC models that were beyond reach with existing numerical methods.
Robust Online Model Adaptation by Extended Kalman Filter with Exponential Moving Average and Dynamic Multi-Epoch Strategy
Abuduweili, Abulikemu, Liu, Changliu
EDU Robotics Institute, Carnegie Mellon University, Pittsburgh, P A 15213, USA Abstract High fidelity behavior prediction of intelligent agents is critical in many applications. However, the prediction model trained on the training set may not generalize to the testing set due to domain shift and time variance. The challenge motivates the adoption of online adaptation algorithms to update prediction models in real-time to improve the prediction performance. Inspired by Extended Kalman Filter (EKF), this paper introduces a series of online adaptation methods, which are applicable to neural network-based models. A base adaptation algorithm Modified EKF with forgetting factor (MEKF λ) is introduced first, followed by exponential moving average filtering techniques. Then this paper introduces a dynamic multi-epoch update strategy to effectively utilize samples received in real time. With all these extensions, we propose a robust online adaptation algorithm: MEKF with Exponential Moving Average and Dynamic Multi-Epoch strategy (MEKF EMA-DME). The proposed algorithm outperforms existing methods as demonstrated in experiments. Keywords: Online adaptation, extended Kalman filter, exponential moving average, optimization 1. Introduction Supervised learning has been widely used to obtain models to predict the behaviors of intelligent agents Rudenko et al. (2019). Behavior prediction is a sub-topic of time series prediction Weigend (2018), which includes but is not limited to vehicle trajectory prediction during autonomous driving Lef evre et al. (2014) and human-motion prediction during human-robot collaboration Cheng et al. (2019).
Counterfactual thinking in cooperation dynamics
Pereira, Luis Moniz, Santos, Francisco C.
Counterfactual Thinking is a human cognitive ability studied in a wide variety of domains. It captures the process of reasoning about a past event that did not occur, namely what would have happened had this event occurred, or, otherwise, to reason about an event that did occur but what would ensue had it not. Given the wide cognitive empowerment of counterfactual reasoning in the human individual, the question arises of how the presence of individuals with this capability may improve cooperation in populations of self-regarding individuals. Here we propose a mathematical model, grounded on Evolutionary Game Theory, to examine the population dynamics emerging from the interplay between counterfactual thinking and social learning (i.e., individuals that learn from the actions and success of others) whenever the individuals in the population face a collective dilemma. Our results suggest that counterfactual reasoning fosters coordination in collective action problems occurring in large populations, and has a limited impact on cooperation dilemmas in which coordination is not required. Moreover, we show that a small prevalence of individuals resorting to counterfactual thinking is enough to nudge an entire population towards highly cooperative standards.
Balancing the Tradeoff between Profit and Fairness in Rideshare Platforms During High-Demand Hours
Nanda, Vedant, Xu, Pan, Sankararaman, Karthik Abinav, Dickerson, John P., Srinivasan, Aravind
Rideshare platforms, when assigning requests to drivers, tend to maximize profit for the system and/or minimize waiting time for riders. Such platforms can exacerbate biases that drivers may have over certain types of requests. We consider the case of peak hours when the demand for rides is more than the supply of drivers. Drivers are well aware of their advantage during the peak hours and can choose to be selective about which rides to accept. Moreover, if in such a scenario, the assignment of requests to drivers (by the platform) is made only to maximize profit and/or minimize wait time for riders, requests of a certain type (e.g. from a non-popular pickup location, or to a non-popular drop-off location) might never be assigned to a driver. Such a system can be highly unfair to riders. However, increasing fairness might come at a cost of the overall profit made by the rideshare platform. To balance these conflicting goals, we present a flexible, non-adaptive algorithm, \lpalg, that allows the platform designer to control the profit and fairness of the system via parameters $\alpha$ and $\beta$ respectively. We model the matching problem as an online bipartite matching where the set of drivers is offline and requests arrive online. Upon the arrival of a request, we use \lpalg to assign it to a driver (the driver might then choose to accept or reject it) or reject the request. We formalize the measures of profit and fairness in our setting and show that by using \lpalg, the competitive ratios for profit and fairness measures would be no worse than $\alpha/e$ and $\beta/e$ respectively. Extensive experimental results on both real-world and synthetic datasets confirm the validity of our theoretical lower bounds. Additionally, they show that $\lpalg$ under some choice of $(\alpha, \beta)$ can beat two natural heuristics, Greedy and Uniform, on \emph{both} fairness and profit.
From Reinforcement Learning to Optimal Control: A unified framework for sequential decisions
There are over 15 distinct communities that work in the general area of sequential decisions and information, often referred to as decisions under uncertainty or stochastic optimization. We focus on two of the most important fields: stochastic optimal control, with its roots in deterministic optimal control, and reinforcement learning, with its roots in Markov decision processes. Building on prior work, we describe a unified framework that covers all 15 different communities, and note the strong parallels with the modeling framework of stochastic optimal control. By contrast, we make the case that the modeling framework of reinforcement learning, inherited from discrete Markov decision processes, is quite limited. Our framework (and that of stochastic control) is based on the core problem of optimizing over policies. We describe four classes of policies that we claim are universal, and show that each of these two fields have, in their own way, evolved to include examples of each of these four classes.
Joint Interaction and Trajectory Prediction for Autonomous Driving using Graph Neural Networks
Lee, Donsuk, Gu, Yiming, Hoang, Jerrick, Marchetti-Bowick, Micol
In this work, we aim to predict the future motion of vehicles in a traffic scene by explicitly modeling their pairwise interactions. Specifically, we propose a graph neural network that jointly predicts the discrete interaction modes and 5-second future trajectories for all agents in the scene. Our model infers an interaction graph whose nodes are agents and whose edges capture the long-term interaction intents among the agents. In order to train the model to recognize known modes of interaction, we introduce an auto-labeling function to generate ground truth interaction labels. Using a large-scale real-world driving dataset, we demonstrate that jointly predicting the trajectories along with the explicit interaction types leads to significantly lower trajectory error than baseline methods. Finally, we show through simulation studies that the learned interaction modes are semantically meaningful.
Putting Ridesharing to the Test: Efficient and Scalable Solutions and the Power of Dynamic Vehicle Relocation
Danassis, Panayiotis, Sakota, Marija, Filos-Ratsikas, Aris, Faltings, Boi
Ridesharing is a coordination problem in its core. Traditionally it has been solved in a centralized manner by ridesharing platforms. Yet, to truly allow for scalable solutions, we needs to shift from traditional approaches, to multi-agent systems, ideally run on-device. In this paper, we show that a recently proposed heuristic (ALMA), which exhibits such properties, offers an efficient, end-to-end solution for the ridesharing problem. Moreover, by utilizing simple relocation schemes we significantly improve QoS metrics, by up to 50%. To demonstrate the latter, we perform a systematic evaluation of a diverse set of algorithms for the ridesharing problem, which is, to the best of our knowledge, one of the largest and most comprehensive to date. Our evaluation setting is specifically designed to resemble reality as closely as possible. In particular, we evaluate 12 different algorithms over 12 metrics related to global efficiency, complexity, passenger, driver, and platform incentives.
Classifying Inconsistency Measures Using Graphs
De Bona, Glauber, Grant, John, Hunter, Anthony, Konieczny, Sebastien
The aim of measuring inconsistency is to obtain an evaluation of the imperfections in a set of formulas, and this evaluation may then be used to help decide on some course of action (such as rejecting some of the formulas, resolving the inconsistency, seeking better sources of information, etc). A number of proposals have been made to define measures of inconsistency. Each has its rationale. But to date, it is not clear how to delineate the space of options for measures, nor is it clear how we can classify measures systematically. To address these problems, we introduce a general framework for comparing syntactic measures of inconsistency. It is based on the notion of an inconsistency graph for each knowledgebase (a bipartite graph with a set of vertices representing formulas in the knowledgebase, a set of vertices representing minimal inconsistent subsets of the knowledgebase, and edges representing that a formula belongs to a minimal inconsistent subset). We then show that various measures can be computed using the inconsistency graph. Then we introduce abstractions of the inconsistency graph and use them to construct a hierarchy of syntactic inconsistency measures. Furthermore, we extend the inconsistency graph concept with a labeling that extends the hierarchy to include some other types of inconsistency measures.
LTLf Synthesis with Fairness and Stability Assumptions
Zhu, Shufang, De Giacomo, Giuseppe, Pu, Geguang, Vardi, Moshe
In synthesis, assumptions are constraints on the environment that rule out certain environment behaviors. A key observation here is that even if we consider systems with LTL f goals on finite traces, environment assumptions need to be expressed over infinite traces, since accomplishing the agent goals may require an unbounded number of environment action. To solve synthesis with respect to finite-trace LTL f goals under infinite-trace assumptions, we could reduce the problem to LTL synthesis. Unfortunately, while synthesis in LTL f and in LTLhave the same worst-case complexity (both 2EXPTIME-complete), the algorithms available for LTLsyn-thesis are much more difficult in practice than those for LTL f synthesis. In this work we show that in interesting cases we can avoid such a detour to LTLsynthesis and keep the simplicity of LTL f synthesis. Specifically, we develop a BDD-based fixpoint-based technique for handling basic forms of fairness and of stability assumptions. We show, empirically, that this technique performs much better than standard LTLsynthesis. Introduction In many situations we are interested in expressing properties over an unbounded but finite sequence of successive states. Linear-time Temporal Logic over finite traces ( LTL f) and its variants have been thoroughly investigated for doing so. There has been broad research for logical reasoning (De Gi-acomo and V ardi 2013; Li et al. 2019), synthesis (De Gi-acomo and V ardi 2015; Camacho et al. 2018), and planning (Camacho et al. 2017; De Giacomo and Rubin 2018). Recently synthesis under assumptions in LTL f has attracted specific interest (De Giacomo and Rubin 2018; Camacho, Bienvenu, and McIlraith 2018). First, planning for LTL f goals can be considered as a form of LTL f synthesis under assumptions, where the assumptions are the dynamics of the environment encoded in the planning domain (Green 1969; Camacho, Bienvenu, and McIlraith 2018; Aminof et al. 2018; Aminof et al. 2019). Synthesis under assumptions has been extensively studied in LTL, where environment assumptions are expressed as LTL formulas (Chatterjee and Henzinger 2007; Chatter-jee, Henzinger, and Jobstmann 2008; D'Ippolito et al. 2013; Bloem, Ehlers, and K onighofer 2015; Brenguier, Raskin, and Sankur 2017). In fact, LTL formulas can be used as assumptions as long as it is guaranteed that the environment is able to behave so as to keep the assumptions true, i.e., assumptions are environment realizable. Under these circumstances, it is possible to reduce synthesis for LTL goal ψ G under assumptions ψ A to standard synthesis for ψ A ψ G. Note that because of the guarantee of ψ A being environment realizable, no agent strategy can win ψ A ψ G by falsifying ψ A. See (Aminof et al. 2019) for a discussion.