Agents
Lookup Table-Based Consensus Algorithm for Real-Time Longitudinal Motion Control of Connected and Automated Vehicles
Wang, Ziran, Han, Kyungtae, Kim, BaekGyu, Wu, Guoyuan, Barth, Matthew J.
Connected and automated vehicle (CAV) technology is one of the promising solutions for addressing the safety, mobility and sustainability issues of our current transportation systems. Specifically, the control algorithm plays an important role in a CAV system, since it executes the commands generated by earlier steps, such as communication, perception, and planning. In this study, we propose a consensus algorithm to control the longitudinal motion of CAVs in real time. Unlike previous studies in this field, where the control gains of the consensus algorithm are pre-determined and fixed, we develop algorithms to build up a lookup table, searching for the ideal control gains with respect to different initial conditions of CAVs in real time. Numerical simulation shows that the proposed lookup table-based consensus algorithm outperforms the authors' previous work, as well as van Arem's linear feedback-based longitudinal motion control algorithm, in all four scenarios with various initial conditions of CAVs, in terms of convergence time and maximum jerk of the simulation run.
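The idea of selecting consensus gains from a lookup table keyed on initial conditions can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the table contents, the bin sizes, the function names, and the simple second-order error dynamics (a follower tracking a leader that drives at constant speed) are all assumptions made for the example.

```python
# Hypothetical gain table: keys are binned (gap error, speed error),
# values are (k_gap, k_speed). The numbers are illustrative only.
GAIN_TABLE = {
    (0, 0): (1.0, 2.0),
    (2, 0): (0.5, 1.5),
    (-2, 0): (0.5, 1.5),
}


def lookup_gains(table, gap_error, speed_error, gap_step=5.0, speed_step=2.0):
    """Return the gains stored for the bin nearest to the initial condition."""
    key = (round(gap_error / gap_step), round(speed_error / speed_step))
    nearest = min(table, key=lambda k: (k[0] - key[0]) ** 2 + (k[1] - key[1]) ** 2)
    return table[nearest]


def simulate_follower(gap_error, speed_error, gains, dt=0.1, steps=600):
    """Explicit-Euler simulation of the follower's error dynamics.

    gap_error   : follower position minus its desired position (m)
    speed_error : follower speed minus leader speed (m/s)
    The commanded acceleration drives both errors towards zero.
    """
    k_gap, k_speed = gains
    for _ in range(steps):
        accel = -k_gap * gap_error - k_speed * speed_error
        gap_error += speed_error * dt
        speed_error += accel * dt
    return gap_error, speed_error


# Pick gains for a vehicle that starts 10 m away from its desired position.
gains = lookup_gains(GAIN_TABLE, gap_error=10.0, speed_error=0.0)
final_gap, final_speed = simulate_follower(10.0, 0.0, gains)
```

In a fuller treatment the table entries would be found offline by searching for the gains that minimise a cost such as convergence time or maximum jerk for each bin; here they are simply fixed by hand.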
AI comes to the AnyLogic Conference - AnyLogic Simulation Software
In response to changes in industry, there are changes to this year's AnyLogic Conference. In a way, this mirrors the relationship developing between simulation and machine learning, each benefitting from the input and feedback of the other. This April's AnyLogic Conference, in addition to presentations from leading simulation practitioners, will feature an expert-led panel discussion on how agent-based simulation is helping AI develop beyond deep learning. The panel will be led by Anand Rao, Global Artificial Intelligence Lead and Partner at PwC. With 33 years of industry experience and an AI background, he helps senior executives manage critical issues.
Stochastic Prediction of Multi-Agent Interactions from Partial Observations
Sun, Chen, Karlsson, Per, Wu, Jiajun, Tenenbaum, Joshua B, Murphy, Kevin
We present a method that learns to integrate temporal information, from a learned dynamics model, with ambiguous visual information, from a learned vision model, in the context of interacting agents. Our method is based on a graph-structured variational recurrent neural network (Graph-VRNN), which is trained end-to-end to infer the current state of the (partially observed) world, as well as to forecast future states. We show that our method outperforms various baselines on two sports datasets, one based on real basketball trajectories, and one generated by a soccer game engine. At any given time, you can only see a subset of the players, and you may or may not be able to see the ball, yet you probably have some reasonable idea about where all the players currently are, even if they are not in the field of view. Similarly, you cannot see the future, but you may still be able to predict where the "agents" (players and ball) will be, at least approximately. Crucially, these problems are intertwined: we are able to predict future states by using a state dynamics model, but we can also use the same dynamics model to infer the current state of the world by extrapolating from the last time we saw each agent. In this paper, we present a unified approach to state estimation and future forecasting for problems of this kind. More precisely, we assume the observed data consists of a sequence of video frames, v, obtained from a stationary or moving camera.
Anytime Heuristic for Weighted Matching Through Altruism-Inspired Behavior
Danassis, Panayiotis, Filos-Ratsikas, Aris, Faltings, Boi
We present a novel anytime heuristic (ALMA), inspired by the human principle of altruism, for solving the assignment problem. ALMA is decentralized, completely uncoupled, and requires no communication between the participants. We prove an upper bound on the convergence speed that is polynomial in the desired number of resources and competing agents per resource; crucially, in the realistic case where the aforementioned quantities are bounded independently of the total number of agents/resources, the convergence time remains constant as the total problem size increases. We have evaluated ALMA under three test cases: (i) an anti-coordination scenario where agents with similar preferences compete over the same set of actions, (ii) a resource allocation scenario in an urban environment, under a constant-time constraint, and finally, (iii) an on-line matching scenario using real passenger-taxi data. In all of the cases, ALMA was able to reach high social welfare, while being orders of magnitude faster than the centralized, optimal algorithm. The latter allows our algorithm to scale to realistic scenarios with hundreds of thousands of agents, e.g., vehicle coordination in urban environments.
Marathon Environments: Multi-Agent Continuous Control Benchmarks in a Modern Video Game Engine
Recent advances in deep reinforcement learning in the paradigm of locomotion using continuous control have raised the interest of game makers for the potential of digital actors using active ragdoll. Currently, the available options to develop these ideas are either researchers' limited codebase or proprietary closed systems. We present Marathon Environments, a suite of open source, continuous control benchmarks implemented on the Unity game engine, using the Unity ML-Agents Toolkit. We demonstrate through these benchmarks that continuous control research is transferable to a commercial game engine. Furthermore, we exhibit the robustness of these environments by reproducing advanced continuous control research, such as learning to walk, run and backflip from motion capture data; learning to navigate complex terrains; and by implementing a video game input control system. We show further robustness by training with alternative algorithms found in OpenAI.Baselines. Finally, we share strategies for significantly reducing the training time.
A Sampling Approach for Proactive Project Scheduling under Generalized Time-dependent Workability Uncertainty
Song, Wen, Kang, Donghun, Zhang, Jie, Cao, Zhiguang, Xi, Hui
In real-world project scheduling applications, activity durations are often uncertain. Proactive scheduling can effectively cope with the duration uncertainties, by generating robust baseline solutions according to a priori stochastic knowledge. However, most of the existing proactive approaches assume that the duration uncertainty of an activity is not related to its scheduled start time, which may not hold in many real-world scenarios. In this paper, we relax this assumption by allowing the duration uncertainty to be time-dependent, which is caused by the uncertainty of whether the activity can be executed on each time slot. We propose a stochastic optimization model to find an optimal Partial-order Schedule (POS) that minimizes the expected makespan. This model can cover both the time-dependent uncertainty studied in this paper and the traditional time-independent duration uncertainty. To circumvent the underlying complexity in evaluating a given solution, we approximate the stochastic optimization model based on Sample Average Approximation (SAA). Finally, we design two efficient branch-and-bound algorithms to solve the NP-hard SAA problem. Empirical evaluation confirms that our approach can generate high-quality proactive solutions for a variety of uncertainty distributions.
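Sample Average Approximation replaces the expected makespan with an average over sampled scenarios. The sketch below shows that evaluation step for a fixed partial-order schedule under time-dependent workability. It is an illustration under stated assumptions, not the paper's model: the function names, the single shared workability calendar per scenario, and the rule that slots beyond the sampled horizon count as workable are all hypothetical choices.

```python
import random
from statistics import mean


def finish_time(start, work, workable):
    """Earliest time by which `work` workable slots have elapsed from `start`.

    Slots past the end of `workable` are treated as workable (an assumption
    that keeps the sketch finite).
    """
    t, done = start, 0
    while done < work:
        if t >= len(workable) or workable[t]:
            done += 1
        t += 1
    return t


def makespan(order, preds, work, workable):
    """Makespan of a partial-order schedule under one workability scenario.

    `order` must be a topological order of the activities; each activity
    starts as soon as all of its predecessors have finished.
    """
    finish = {}
    for a in order:
        start = max((finish[p] for p in preds[a]), default=0)
        finish[a] = finish_time(start, work[a], workable)
    return max(finish.values())


def saa_expected_makespan(order, preds, work, scenarios):
    """SAA estimate: average the makespan over the sampled scenarios."""
    return mean(makespan(order, preds, work, s) for s in scenarios)


def sample_scenarios(n, horizon, p_workable, seed=0):
    """Draw n scenarios in which each slot is workable with prob. p_workable."""
    rng = random.Random(seed)
    return [[rng.random() < p_workable for _ in range(horizon)] for _ in range(n)]


# Three activities: B and C both wait for A.
order = ["A", "B", "C"]
preds = {"A": [], "B": ["A"], "C": ["A"]}
work = {"A": 2, "B": 1, "C": 3}
estimate = saa_expected_makespan(order, preds, work, sample_scenarios(200, 30, 0.8))
```

An optimiser such as branch-and-bound would call an evaluation like `saa_expected_makespan` on each candidate schedule; the search itself is outside the scope of this sketch.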
Embedded Agency
Demski, Abram, Garrabrant, Scott
Traditional models of rational action treat the agent as though it is cleanly separated from its environment, and can act on that environment from the outside. Such agents have a known functional relationship with their environment, can model their environment in every detail, and do not need to reason about themselves or their internal parts. We provide an informal survey of obstacles to formalizing good reasoning for agents embedded in their environment. Such agents must optimize an environment that is not of type "function"; they must rely on models that fit within the modeled environment; and they must reason about themselves as just another physical system, made of parts that can be modified and that can work at cross purposes.
Testing Preferential Domains Using Sampling
Dey, Palash, Nath, Swaprava, Shakya, Garima
A preferential domain is a collection of sets of preferences which are linear orders over a set of alternatives. These domains have been studied extensively in social choice theory due to both their practical importance and theoretical elegance. Examples of extensively studied preferential domains include single peaked, single crossing, Euclidean, etc. In this paper, we study the sample complexity of testing whether a given preference profile is close to some specific domain. We consider two notions of closeness: (a) closeness via preferences, and (b) closeness via alternatives. We further explore the effect on the sample complexity of the testing problem of assuming that the outlier preferences/alternatives are random (instead of arbitrary). In most cases, we show that the above testing problem can be solved with high probability for all commonly used domains by observing only a small number of samples (independent of the number of preferences, $n$, and often the number of alternatives, $m$). In the remaining few cases, we prove either impossibility results or an $\Omega(n)$ lower bound on the sample complexity. We complement our theoretical findings with extensive simulations to determine the actual constant factors of our asymptotic sample complexity bounds.
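To make the sampling idea concrete, the sketch below checks a random sample of preferences for single-peakedness. It is a simplified illustration, not the paper's tester: it assumes the left-to-right axis of alternatives is already known, and the function names and sample size are hypothetical.

```python
import random


def is_single_peaked(ranking, axis):
    """Check one ranking (most preferred first) against a known axis.

    A ranking is single peaked on the axis exactly when, for every j, its
    top-j alternatives form a contiguous interval of the axis. We grow that
    interval one alternative at a time.
    """
    pos = {a: i for i, a in enumerate(axis)}
    lo = hi = pos[ranking[0]]
    for a in ranking[1:]:
        p = pos[a]
        if p == lo - 1:
            lo = p
        elif p == hi + 1:
            hi = p
        else:
            return False
    return True


def sample_test(profile, axis, num_samples, seed=0):
    """Accept if every sampled preference is single peaked on the axis.

    Sampling is uniform with replacement, so the cost depends on
    num_samples rather than on the size of the profile.
    """
    rng = random.Random(seed)
    sample = [rng.choice(profile) for _ in range(num_samples)]
    return all(is_single_peaked(r, axis) for r in sample)


axis = ["a", "b", "c", "d"]
profile = [["b", "c", "a", "d"], ["c", "d", "b", "a"], ["a", "b", "c", "d"]]
accepted = sample_test(profile, axis, num_samples=20)
```

If a profile is far from the domain, a large fraction of its preferences violate the check, so a small sample is likely to hit one of them; a profile that lies in the domain is always accepted.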
An Influence Network Model to Study Discrepancies in Expressed and Private Opinions
Ye, Mengbin, Qin, Yuzhen, Govaert, Alain, Anderson, Brian D. O., Cao, Ming
In many social situations, a discrepancy arises between an individual's private and expressed opinions on a given topic. Motivated by Solomon Asch's seminal experiments on social conformity and other related socio-psychological works, we propose a novel opinion dynamics model to study how such a discrepancy can arise in general social networks of interpersonal influence. Each individual in the network has both a private and an expressed opinion: an individual's private opinion evolves under social influence from the expressed opinions of the individual's neighbours, while the individual determines his or her expressed opinion under a pressure to conform to the average expressed opinion of his or her neighbours, termed the local public opinion. General conditions on the network that guarantee exponentially fast convergence of the opinions to a limit are obtained. Further analysis of the limit yields several semi-quantitative conclusions, which have insightful social interpretations, including the establishing of conditions that ensure every individual in the network has such a discrepancy. Last, we show the generality and validity of the model by using it to explain and predict the results of Solomon Asch's seminal experiments.
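The two-opinion mechanism described above can be illustrated with a toy update rule. This is a sketch consistent with the verbal description only; it is not claimed to reproduce the paper's equations. The function name, the `resilience` parameter (how strongly an individual resists conforming), and the row-stochastic `weights` matrix are assumptions introduced for the example.

```python
def step(private, expressed, weights, resilience):
    """One synchronous update of private and expressed opinions.

    private[i], expressed[i] : current opinions of individual i
    weights[i][j]            : influence of j on i (each row assumed to sum to 1)
    resilience[i]            : in [0, 1]; 1 means i expresses the private opinion
    """
    n = len(private)
    new_private, new_expressed = [], []
    for i in range(n):
        # Private opinion: own private view plus neighbours' expressed views.
        p = weights[i][i] * private[i] + sum(
            weights[i][j] * expressed[j] for j in range(n) if j != i
        )
        # Local public opinion: average expressed opinion of i's neighbours.
        nbrs = [j for j in range(n) if j != i and weights[i][j] > 0]
        local_public = (
            sum(expressed[j] for j in nbrs) / len(nbrs) if nbrs else expressed[i]
        )
        # Expressed opinion: private view pulled towards the local public opinion.
        e = resilience[i] * p + (1 - resilience[i]) * local_public
        new_private.append(p)
        new_expressed.append(e)
    return new_private, new_expressed


# Two individuals who start with opposite opinions and moderate resilience.
weights = [[0.5, 0.5], [0.5, 0.5]]
private, expressed = step([0.0, 1.0], [0.0, 1.0], weights, [0.5, 0.5])
```

After this single step the two private opinions coincide while the expressed opinions still differ, which is the kind of private/expressed discrepancy the model is designed to study.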
Emergent Coordination Through Competition
Liu, Siqi, Lever, Guy, Merel, Josh, Tunyasuvunakool, Saran, Heess, Nicolas, Graepel, Thore
We study the emergence of cooperative behaviors in reinforcement learning agents by introducing a challenging competitive multi-agent soccer environment with continuous simulated physics. We demonstrate that decentralized, population-based training with co-play can lead to a progression in agents' behaviors: from random, to simple ball chasing, and finally showing evidence of cooperation. Our study highlights several of the challenges encountered in large scale multi-agent training in continuous control. In particular, we demonstrate that the automatic optimization of simple shaping rewards, not themselves conducive to co-operative behavior, can lead to long-horizon team behavior. We further apply an evaluation scheme, grounded in game-theoretic principles, that can assess agent performance in the absence of pre-defined evaluation tasks or human baselines.