Agents
Learning Sharing Behaviors with Arbitrary Numbers of Agents
Metcalf, Katherine, Theobald, Barry-John, Apostoloff, Nicholas
We propose a method for modeling and learning turn-taking behaviors for accessing a shared resource. We model the individual behavior for each agent in an interaction and then use a multi-agent fusion model to generate a summary over the expected actions of the group to render the model independent of the number of agents. The individual behavior models are weighted finite state transducers (WFSTs) with weights dynamically updated during interactions, and the multi-agent fusion model is a logistic regression classifier. We test our models in a multi-agent tower-building environment, where a Q-learning agent learns to interact with rule-based agents. Our approach accurately models the underlying behavior patterns of the rule-based agents with accuracy ranging between 0.63 and 1.0 depending on the stochasticity of the other agent behaviors. In addition we show using KL-divergence that the model accurately captures the distribution of next actions when interacting with both a single agent (KL-divergence < 0.1) and with multiple agents (KL-divergence < 0.37). Finally, we demonstrate that our behavior model can be used by a Q-learning agent to take turns in an interactive turn-taking environment.
New Approaches to Inverse Structural Modification Theory using Random Projections
Cheema, Prasad, Alamdari, Mehrisadat M., Vio, Gareth A.
For example during flight, an aircraft's modal properties are known to change with both altitude and velocity. It is thus important to quantify these changes given only a truncated set of modal data, which is usually the case experimentally. This procedure is formally known as the generalised inverse eigenvalue problem. In this paper we experimentally show that first-order gradient-based methods that optimise objective functions defined over a modal are prohibitive due to the required small step sizes. This in turn leads to the justification of using a non-gradient, black box optimiser in the form of particle swarm optimisation. We further show how it is possible to solve such inverse eigenvalue problems in a lower dimensional space by the use of random projections, which in many cases reduces the total dimensionality of the optimisation problem by 80% to 99%. Two example problems are explored involving a ten-dimensional mass-stiffness toy problem, and a one-dimensional finite element mass-stiffness approximation for a Boeing 737-300 aircraft.
Building Ethically Bounded AI
Rossi, Francesca, Mattei, Nicholas
The more AI agents are deployed in scenarios with possibly unexpected situations, the more they need to be flexible, adaptive, and creative in achieving the goal we have given them. Thus, a certain level of freedom to choose the best path to the goal is inherent in making AI robust and flexible enough. At the same time, however, the pervasive deployment of AI in our life, whether AI is autonomous or collaborating with humans, raises several ethical challenges. AI agents should be aware and follow appropriate ethical principles and should thus exhibit properties such as fairness or other virtues. These ethical principles should define the boundaries of AI's freedom and creativity. However, it is still a challenge to understand how to specify and reason with ethical boundaries in AI agents and how to combine them appropriately with subjective preferences and goal specifications. Some initial attempts employ either a data-driven example-based approach for both, or a symbolic rule-based approach for both. We envision a modular approach where any AI technique can be used for any of these essential ingredients in decision making or decision support systems, paired with a contextual approach to define their combination and relative weight. In a world where neither humans nor AI systems work in isolation, but are tightly interconnected, e.g., the Internet of Things, we also envision a compositional approach to building ethically bounded AI, where the ethical properties of each component can be fruitfully exploited to derive those of the overall system. In this paper we define and motivate the notion of ethically-bounded AI, we describe two concrete examples, and we outline some outstanding challenges.
Hierarchical Fuzzy Opinion Networks: Top-Down for Social Organizations and Bottom-Up for Election
A fuzzy opinion is a Gaussian fuzzy set with the center representing the opinion and the standard deviation representing the uncertainty about the opinion, and a fuzzy opinion network is a connection of a number of fuzzy opinions in a structured way. In this paper, we propose: (a) a top-down hierarchical fuzzy opinion network to model how the opinion of a top leader is penetrated into the members in social organizations, and (b) a bottom-up fuzzy opinion network to model how the opinions of a large number of agents are agglomerated layer-by-layer into a consensus or a few opinions in the social processes such as an election. For the top-down hierarchical fuzzy opinion network, we prove that the opinions of all the agents converge to the leaders opinion, but the uncertainties of the agents in different groups are generally converging to different values. We demonstrate that the speed of convergence is greatly improved by organizing the agents in a hierarchical structure of small groups. For the bottom-up hierarchical fuzzy opinion network, we simulate how a wide spectrum of opinions are negotiating and summarizing with each other in a layer-by-layer fashion in some typical situations.
Communication-Efficient Distributed Reinforcement Learning
Chen, Tianyi, Zhang, Kaiqing, Giannakis, Georgios B., Başar, Tamer
This paper studies the distributed reinforcement learning (DRL) problem involving a central controller and a group of learners. Two DRL settings that find broad applications are considered: multi-agent reinforcement learning (RL) and parallel RL. In both settings, frequent information exchange between the learners and the controller are required. However, for many distributed systems, e.g., parallel machines for training deep RL algorithms, and multi-robot systems for learning the optimal coordination strategies, the overhead caused by frequent communication is not negligible and becomes the bottleneck of the overall performance. To overcome this challenge, we develop a new policy gradient method that is amenable to efficient implementation in such communication-constrained settings. By adaptively skipping the policy gradient communication, our method can reduce the communication overhead without degrading the learning accuracy. Analytically, we can establish that i) the convergence rate of our algorithm is the same as the vanilla policy gradient for the DRL tasks; and, ii) if the distributed computing units are heterogeneous in terms of their reward functions and initial state distributions, the number of communication rounds needed to achieve a targeted learning accuracy is reduced. Numerical experiments on a popular multi-agent RL benchmark corroborate the significant communication reduction of our algorithm compared to the alternatives.
Building Ethics into Artificial Intelligence
Yu, Han, Shen, Zhiqi, Miao, Chunyan, Leung, Cyril, Lesser, Victor R., Yang, Qiang
As artificial intelligence (AI) systems become increasingly ubiquitous, the topic of AI governance for ethical decision-making by AI has captured public imagination. Within the AI research community, this topic remains less familiar to many researchers. In this paper, we complement existing surveys, which largely focused on the psychological, social and legal discussions of the topic, with an analysis of recent advances in technical solutions for AI governance. By reviewing publications in leading AI conferences including AAAI, AAMAS, ECAI and IJCAI, we propose a taxonomy which divides the field into four areas: 1) exploring ethical dilemmas; 2) individual ethical decision frameworks; 3) collective ethical decision frameworks; and 4) ethics in human-AI interactions. We highlight the intuitions and key techniques used in each approach, and discuss promising future research directions towards successful integration of ethical AI systems into human societies.
Scope of Research on Particle Swarm Optimization Based Data Clustering
Metre, Vishakha A, Deshmukh, Mr Pramod B
Optimization is nothing but a mathematical technique which finds maxima or minima of any function of concern in some realistic region. Different optimization techniques are proposed which are competing for the best solution. Particle Swarm Optimization (PSO) is a new, advanced, and most powerful optimization methodology that performs empirically well on several optimization problems. It is the extensively used Swarm Intelligence (SI) inspired optimization algorithm used for finding the global optimal solution in a multifaceted search region. Data clustering is one of the challenging real world applications that invite the eminent research works in variety of fields. Applicability of different PSO variants to data clustering is studied in the literature, and the analyzed research work shows that, PSO variants give poor results for multidimensional data. This paper describes the different challenges associated with multidimensional data clustering and scope of research on optimizing the clustering problems using PSO. We also propose a strategy to use hybrid PSO variant for clustering multidimensional numerical, text and image data.
Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning
Zhang, Kaiqing, Yang, Zhuoran, Liu, Han, Zhang, Tong, Başar, Tamer
Despite the increasing interest in multi-agent reinforcement learning (MARL) in the community, understanding its theoretical foundation has long been recognized as a challenging problem. In this work, we make an attempt towards addressing this problem, by providing finite-sample analyses for fully decentralized MARL. Specifically, we consider two fully decentralized MARL settings, where teams of agents are connected by time-varying communication networks, and either collaborate or compete in a zero-sum game, without the absence of any central controller. These settings cover many conventional MARL settings in the literature. For both settings, we develop batch MARL algorithms that can be implemented in a fully decentralized fashion, and quantify the finite-sample errors of the estimated action-value functions. Our error analyses characterize how the function class, the number of samples within each iteration, and the number of iterations determine the statistical accuracy of the proposed algorithms. Our results, compared to the finite-sample bounds for single-agent RL, identify the involvement of additional error terms caused by decentralized computation, which is inherent in our decentralized MARL setting. To our knowledge, our work appears to be the first finite-sample analyses for MARL, which sheds light on understanding both the sample and computational efficiency of MARL algorithms.
An Evolutionary Hierarchical Interval Type-2 Fuzzy Knowledge Representation System (EHIT2FKRS) for Travel Route Assignment
Zouari, Mariam, Baklouti, Nesrine, Medina, Javier Sanchez, Ayed, Mounir Ben, Alimi, Adel M.
Urban Traffic Networks are characterized by high dynamics of traffic flow and increased travel time, including waiting times. This leads to more complex road traffic management. The present research paper suggests an innovative advanced traffic management system based on Hierarchical Interval Type-2 Fuzzy Logic model optimized by the Particle Swarm Optimization (PSO) method. The aim of designing this system is to perform dynamic route assignment to relieve traffic congestion and limit the unexpected fluctuation effects on traffic flow. The suggested system is executed and simulated using SUMO, a well-known microscopic traffic simulator. For the present study, we have tested four large and heterogeneous metropolitan areas located in the cities of Sfax, Luxembourg, Bologna and Cologne. The experimental results proved the effectiveness of learning the Hierarchical Interval type-2 Fuzzy logic using real time particle swarm optimization technique PSO to accomplish multiobjective optimality regarding two criteria: number of vehicles that reach their destination and average travel time. The obtained results are encouraging, confirming the efficiency of the proposed system.
Chore division on a graph
Bouveret, Sylvain, Cechlárová, Katarína, Lesca, Julien
The paper considers fair allocation of indivisible nondisposable items that generate disutility (chores). We assume that these items are placed in the vertices of a graph and each agent's share has to form a connected subgraph of this graph. Although a similar model has been investigated before for goods, we show that the goods and chores settings are inherently different. In particular, it is impossible to derive the solution of the chores instance from the solution of its naturally associated fair division instance. We consider three common fair division solution concepts, namely proportionality, envy-freeness and equitability, and two individual disutility aggregation functions: additive and maximum based. We show that deciding the existence of a fair allocation is hard even if the underlying graph is a path or a star. We also present some efficiently solvable special cases for these graph topologies.