Agents
Transfer Learning versus Multi-agent Learning regarding Distributed Decision-Making in Highway Traffic
Schutera, Mark, Goby, Niklas, Neumann, Dirk, Reischl, Markus
Transportation and traffic are currently undergoing a rapid increase in terms of both scale and complexity. At the same time, an increasing share of traffic participants are being transformed into agents driven or supported by artificial intelligence, resulting in mixed-intelligence traffic. This work explores the implications of distributed decision-making in mixed-intelligence traffic. The investigations are carried out on the basis of an online-simulated highway scenario, namely the MIT DeepTraffic simulation. In the first step, traffic agents are trained by means of a deep reinforcement learning approach that is deployed inside an elitist evolutionary algorithm for hyperparameter search. The resulting architectures and training parameters are then used either to train a single autonomous traffic agent and transfer the learned weights onto a multi-agent scenario, or to conduct multi-agent learning directly. Both learning strategies are evaluated on different ratios of mixed-intelligence traffic. The strategies are assessed according to the average speed of all agents driven by artificial intelligence. Traffic patterns that provoke a reduction in traffic flow are analyzed with respect to the different strategies.
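The elitist evolutionary hyperparameter search mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the hyperparameter names (`lr`, `hidden`), the mutation scheme, and especially `toy_fitness`, which stands in for the expensive step of training an agent and measuring its average speed in the simulation. It is not the authors' implementation.

```python
import math
import random


def elitist_search(fitness, init, mutate, pop_size=8, n_elite=2,
                   generations=20, seed=0):
    """Generic elitist evolutionary search (illustrative sketch).

    The best `n_elite` candidates survive unchanged into the next
    generation, so the best fitness seen can never decrease.
    """
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elites = population[:n_elite]
        history.append(fitness(elites[0]))
        # Children are mutated copies of randomly chosen elites.
        children = [mutate(rng.choice(elites), rng)
                    for _ in range(pop_size - n_elite)]
        population = elites + children
    return max(population, key=fitness), history


def toy_fitness(h):
    # Hypothetical stand-in for "average speed after training";
    # peaks at lr = 1e-3 and hidden = 64.
    return -(math.log10(h["lr"]) + 3.0) ** 2 - ((h["hidden"] - 64) / 64.0) ** 2


def init(rng):
    # Hypothetical search space: log-uniform learning rate, integer layer width.
    return {"lr": 10 ** rng.uniform(-5, -1), "hidden": rng.randint(8, 256)}


def mutate(parent, rng):
    # Perturb the learning rate multiplicatively and the width additively.
    return {"lr": parent["lr"] * 10 ** rng.gauss(0, 0.3),
            "hidden": max(1, parent["hidden"] + rng.randint(-16, 16))}


if __name__ == "__main__":
    best, history = elitist_search(toy_fitness, init, mutate)
    print(best, history[-1])
```

Because elites are copied unchanged, the recorded best fitness is non-decreasing across generations; in a real setting each fitness call would be a full training run, so the population size and generation count would be chosen with that cost in mind.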
Planification par fusions incrémentales de graphes
Pellier, Damien, Belaidi, Ilias
In this paper, we introduce a new generic model for distributed planning called "Distributed Planning Through Graph Merging" (DPGM). This model unifies the different steps of the distributed planning process into a single step. Our approach is based on a planning graph structure for agent reasoning and a CSP mechanism for individual plan extraction and coordination. We assume that no agent can reach the global goal alone. Therefore the agents must cooperate, i.e., take into account potential positive interactions between their activities to reach their common shared goal. The originality of our model lies in considering as early as possible, i.e., during the individual planning process, the positive and negative interactions between agents' activities in order to reduce the search cost of a global coordinated solution plan.
Assumption-Based Planning
Pellier, Damien, Fiorino, Humbert
The purpose of the paper is to introduce a new approach to planning called Assumption-Based Planning. This approach offers a way to devise a planner based on a multi-agent system in which a global shared plan is produced through conjecture/refutation cycles. Contrary to classical approaches, our contribution relies on the agents' reasoning, which leads to the production of a plan from planning domains. To take into account complex environments and the agents' partial knowledge, we propose to treat the planning problem as defeasible reasoning in which the agents exchange proposals and counter-proposals and are able to reason about uncertainty. The argumentation dialogue between agents must not be viewed as a negotiation process but as an investigation process whose aim is to build a plan. In this paper, we focus on the mechanisms that allow an agent to produce 'reasonable' proposals according to its knowledge.
The Text-Based Adventure AI Competition
Atkinson, Timothy, Baier, Hendrik, Copplestone, Tara, Devlin, Sam, Swan, Jerry
Abstract--In 2016, 2017, and 2018 at the IEEE Conference on Computational Intelligence in Games, the authors of this paper ran a competition for agents that can play classic text-based adventure games. This competition fills a gap in existing game AI competitions that have typically focussed on traditional card/board games or modern video games with graphical interfaces. By providing a platform for evaluating agents in text-based adventures, the competition provides a novel benchmark for game AI with unique challenges for natural language understanding and generation. This paper summarises the three competitions run in 2016, 2017, and 2018 (including details of open source implementations of both the competition framework and our competitors) and presents the results of an improved evaluation of these competitors across 20 games.
I. INTRODUCTION
Before the widespread availability of graphical displays, text adventures were one of the few game genres that owed their existence solely to computing. The first text adventure was Colossal Cave (also known simply as Adventure), written in 1976 by Will Crowther for the PDP-10 mainframe [1]. With the advent of home computing in the late 1970s, Colossal Cave and other games such as Zork were enjoyed by many. The majority of early text adventures used a narration-action loop that accepted simple commands of the general form VERB or VERB NOUN (e.g. In response to such commands, the programs provided a description of the immediate environment, e.g. 'You are in an open field on the west side of a white house with a boarded front door. There is a small mailbox here.'
The Robot Economy Will Run on Blockchain - Issue 65: In Plain Sight
Our future will be bright, fast--and full of robots. It'll be more Asimov than Terminator: servant robots, more or less similar to us. Some will be upright androids, but most will be boxes filled with computer chips running software agents. And there will be a lot of them. Forecasts predict that, within just three years, we'll have 1.7 million robots in industry, 32 million in our households, and 400,000 in professional offices. Robots will begin to run our factories.
AI's next battlefield is literally the battlefield: In 20 years, bots will fight our wars - Army boffin
The notion of deploying armed human soldiers on the ground to fight wars will disappear over time, according to one of America's top military scientists. "We have to get used to the radical idea that we, human beings, will be just one species of intelligent beings," Alexander Kott, chief of the Network Science Division of the US Army Research Laboratory, told the Conference on Applied Machine Learning for Information Security (CAMLIS) on Friday. Kott predicted a dystopian future where human warriors share the battlefield with intelligent agents in the form of robots, sensors, smart weapons, autonomous vehicles, and wearable gizmos. These exist today to some degree; in the future, however, they will be much more intelligent and will use machine-learning software to automatically take in fresh information and make decisions in a constantly changing environment. "It's coming and will be a reality in 20 years," Kott said.
A Proximal Zeroth-Order Algorithm for Nonconvex Nonsmooth Problems
Abstract--In this paper, we focus on solving an important class of nonconvex optimization problems that covers many applications, for example signal processing over a networked multi-agent system and distributed learning over networks. Motivated by the many applications in which the local objective function is the sum of a smooth but possibly nonconvex part and a nonsmooth but convex part, subject to a linear equality constraint, this paper proposes a proximal zeroth-order primal dual algorithm (PZO-PDA) that accounts for the information structure of the problem. The algorithm uses only zeroth-order information (i.e., function values) of the smooth functions, which makes it applicable where only noisy information about the objective function is accessible and classical methods cannot be applied. We prove convergence and establish the rate of convergence for PZO-PDA. Numerical experiments are provided to validate the theoretical results.
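To make the two ingredients named in the abstract concrete, here is a minimal single-agent sketch that combines a zeroth-order gradient estimate (function values only) with a proximal step for a nonsmooth convex term. It is a plain proximal gradient loop, not PZO-PDA: there is no primal-dual update, no network, and no linear constraint. The coordinate-wise finite-difference estimator, the L1 regularizer, and all names and parameter values are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||x||_1, applied element-wise."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def zo_gradient(f, x, mu=1e-4):
    """Estimate the gradient of f at x from function values only,
    using central differences along each coordinate."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += mu
        x_minus[i] -= mu
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * mu))
    return grad


def zo_proximal_gradient(f, x0, lam, step=0.5, iters=100):
    """Minimize f(x) + lam * ||x||_1 without ever differentiating f."""
    x = list(x0)
    for _ in range(iters):
        g = zo_gradient(f, x)
        # Gradient step on the smooth part, then prox of the nonsmooth part.
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, g)],
                           step * lam)
    return x


if __name__ == "__main__":
    b = [3.0, 0.5, -2.0]

    def smooth_part(x):
        # Smooth test objective: 0.5 * ||x - b||^2.
        return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))

    print(zo_proximal_gradient(smooth_part, [0.0, 0.0, 0.0], lam=1.0))
```

For this quadratic test objective the minimizer is known in closed form (soft-thresholding `b` by `lam`), which makes the sketch easy to check; with a noisy function oracle one would typically average several estimates or use randomized directions instead.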
Multi-Agent Fully Decentralized Off-Policy Learning with Linear Convergence Rates
Cassano, Lucas, Yuan, Kun, Sayed, Ali H.
In this paper we develop a fully decentralized algorithm for policy evaluation with off-policy learning, linear function approximation, and $O(n)$ complexity in both computation and memory requirements. The proposed algorithm is of the variance-reduced kind and achieves linear convergence. We consider the case where a collection of agents have distinct, fixed-size datasets gathered following different behavior policies (none of which is required to explore the full state space) and all collaborate to evaluate a common target policy. The network approach allows all agents to converge to the optimal solution even in situations where no agent can converge on its own without cooperation. We provide simulations to illustrate the effectiveness of the method.
Finding Options that Minimize Planning Time
Jinnai, Yuu, Abel, David, Littman, Michael, Konidaris, George
While adding temporally abstract actions, or options, to an agent's action repertoire can often accelerate learning and planning, existing approaches for determining which specific options to add are largely heuristic. We aim to formalize the problem of selecting the optimal set of options for planning, in two contexts: 1) finding the set of $k$ options that minimize the number of value-iteration passes until convergence, and 2) computing the smallest set of options so that planning converges in less than a given maximum of $\ell$ value-iteration passes. We first show that both problems are NP-hard. We then provide a polynomial-time approximation algorithm for computing the optimal options for tasks with bounded return and goal states. We prove that the algorithm has bounded suboptimality for deterministic tasks. Finally, we empirically evaluate its performance against both the optimal options and a representative collection of heuristic approaches in simple grid-based domains including the classic four rooms problem.
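The quantity being minimized, the number of value-iteration passes until convergence, can be illustrated on a toy problem. The sketch below assumes a deterministic chain of states with a single goal at the right end, a reward of 1 for entering the goal, zero-initialized values, and synchronous updates; options are hypothetical `(start, end, length)` triples that jump from `start` to `end` in `length` steps. It only counts passes for a given option set and does not implement the paper's approximation algorithm.

```python
def value_iteration_passes(n, options=(), gamma=0.9, tol=1e-12):
    """Count synchronous value-iteration passes until the values stop changing.

    States are 0..n-1 on a line; state n-1 is an absorbing goal with value 0.
    Primitive actions move one step left or right. Entering the goal pays 1.
    """
    goal = n - 1
    values = [0.0] * n
    passes = 0
    while True:
        new_values = [0.0] * n
        for s in range(goal):
            candidates = []
            # Primitive actions: one step left or right.
            for nxt in (s - 1, s + 1):
                if 0 <= nxt < n:
                    reward = 1.0 if nxt == goal else 0.0
                    candidates.append(reward + gamma * values[nxt])
            # Options: multi-step jumps, discounted by their duration.
            for start, end, length in options:
                if start == s:
                    reward = 1.0 if end == goal else 0.0
                    candidates.append(
                        gamma ** (length - 1) * (reward + gamma * values[end]))
            new_values[s] = max(candidates)
        if max(abs(a - b) for a, b in zip(values, new_values)) <= tol:
            return passes
        values = new_values
        passes += 1


if __name__ == "__main__":
    print(value_iteration_passes(9))
    print(value_iteration_passes(9, options=[(4, 8, 4)]))
```

On this chain, a single option from the middle state to the goal lets value information reach the far end in fewer passes than primitive actions alone, which is the effect the two problem formulations above try to optimize.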
Reinforcement Learning Decoders for Fault-Tolerant Quantum Computation
Sweke, Ryan, Kesselring, Markus S., van Nieuwenburg, Evert P. L., Eisert, Jens
In order to implement large scale quantum computations it is necessary to be able to store and manipulate quantum information in a manner that is robust to the unavoidable errors introduced through interaction of the physical qubits with a noisy environment. The known strategy for achieving such robustness is to encode a single logical qubit into the state of many physical qubits, via a quantum error correcting code, from which it is possible to actively diagnose and correct errors that may occur [1, 2]. While many quantum error correcting codes exist, topological quantum codes [1-8], in which only local operations are required to diagnose and correct errors, are of particular interest as a result of their experimental feasibility [9-15]. In particular, the surface code has emerged as an especially promising candidate for large-scale fault-tolerant quantum computation, due to the combination of its comparatively low overhead and locality requirements, coupled with the availability of convenient strategies for the implementation of all required logical gates [16, 17]. In fact, current road maps towards the realization of robust quantum computing have identified surface code based approaches as the most feasible methodology for achieving this goal [18].