Agents
Knowledge, Fairness, and Social Constraints
Aziz, Haris ( Data61 , CSIRO and UNSW ) | Bouveret, Sylvain ( Univ . Grenoble- Alpes ) | Caragiannis, Ioannis (University of Patras) | Giagkousi, Ira (University of Patras) | Lang, Jérôme ( CNRS , U. Paris- Dauphine , PSL )
In the context of fair allocation of indivisible items, fairness concepts often compare the satisfaction of an agent to the satisfaction she would have from items that are not allocated to her: in particular, envy-freeness requires that no agent prefers the share of someone else to her own share. We argue that these notions could also be defined relative to the knowledge that an agent has on how the items that she does not receive are distributed among other agents. We define a family of epistemic notions of envy-freeness, parameterized by a social graph, where an agent observes the share of her neighbours but not of her non-neighbours. We also define an intermediate notion between envy-freeness and proportionality, also parameterized by a social graph. These weaker notions of envy-freeness are useful when seeking a fair allocation, since envy-freeness is often too strong. We position these notions with respect to known ones, thus revealing new rich hierarchies of fairness concepts. Finally, we present a very general framework that covers all the existing and many new fairness concepts.
Truthful and Near-Optimal Mechanisms for Welfare Maximization in Multi-Winner Elections
Bhaskar, Umang (Tata Institute of Fundamental Research, Mumbai) | Dani, Varsha (University of New Mexico) | Ghosh, Abheek (Indian Institute of Technology, Guwahati)
Mechanisms for aggregating the preferences of agents in elections need to balance many different considerations, including efficiency, information elicited from agents, and manipulability. We consider the utilitarian social welfare of mechanisms for preference aggregation, measured by the distortion. We show that for a particular input format called threshold approval voting, where each agent is presented with an independently chosen threshold, there is a mechanism with nearly optimal distortion when the number of voters is large. Threshold mechanisms are potentially manipulable, but place a low informational burden on voters. We then consider truthful mechanisms. For the widely-studied class of ordinal mechanisms which elicit the rankings of candidates from each agent, we show that truthfulness essentially imposes no additional loss of welfare. We give truthful mechanisms with distortion O(√m log m) for k-winner elections, and distortion O(√m log m) when candidates have arbitrary costs, in elections with m candidates. These nearly match known lower bounds for ordinal mechanisms that ignore the strategic behavior. We further tighten these lower bounds and show that for truthful mechanisms our first upper bound is tight. Lastly, when agents decide between two candidates, we give tight bounds on the distortion for truthful mechanisms.
Rank Maximal Equal Contribution: A Probabilistic Social Choice Function
Aziz, Haris (Data61, CSIRO and University of New South Wales) | Luo, Pang (Data61, CSIRO and University of New South Wales) | Rizkallah, Christine (University of Pennsylvania)
When aggregating preferences of agents via voting, two desirable goals are to incentivize agents to participate in the voting process and then identify outcomes that are Pareto efficient. We consider participation as formalized by Brandl, Brandt, and Hofbauer (2015) based on the stochastic dominance (SD) relation. We formulate a new rule called RMEC (Rank Maximal Equal Contribution) that is polynomial-time computable, ex post efficient and satisfies the strongest notion of participation. It also satisfies many other desirable fairness properties. The rule suggests a general approach to achieving very strong participation, ex post efficiency and fairness.
Norm Conflict Resolution in Stochastic Domains
Kasenberg, Daniel (Tufts University) | Scheutz, Matthias (Tufts University)
Artificial agents will need to be aware of human moral and social norms, and able to use them in decision-making. In particular, artificial agents will need a principled approach to managing conflicting norms, which are common in human social interactions. Existing logic-based approaches suffer from normative explosion and are typically designed for deterministic environments; reward-based approaches lack principled ways of determining which normative alternatives exist in a given environment. We propose a hybrid approach, using Linear Temporal Logic (LTL) representations in Markov Decision Processes (MDPs), that manages norm conflicts in a systematic manner while accommodating domain stochasticity. We provide a proof-of-concept implementation in a simulated vacuum cleaning domain.
Strategic Coalitions With Perfect Recall
Naumov, Pavel (Vassar College) | Tao, Jia (Lafayette College)
The paper proposes a bimodal logic that describes an interplay between distributed knowledge modality and coalition know-how modality. Unlike other similar systems, the one proposed here assumes perfect recall by all agents. Perfect recall is captured in the system by a single axiom. The main technical results are the soundness and the completeness theorems for the proposed logical system.
Novel Exploration Techniques (NETs) for Malaria Policy Interventions
Bent, Oliver (University of Oxford) | Remy, Sekou L. (IBM Research Africa) | Roberts, Stephen (University of Oxford) | Walcott-Bryant, Aisha (IBM Research Africa)
The task of decision-making under uncertainty is daunting, especially for problems which have significant complexity. Healthcare policy makers across the globe are facing problems under challenging constraints, with limited tools to help them make data driven decisions. In this work we frame the process of finding an optimal malaria policy as a stochastic multi-armed bandit problem, and implement three agent based strategies to explore the policy space. We apply a Gaussian Process regression to the findings of each agent, both for comparison and to account for stochastic results from simulating the spread of malaria in a fixed population. The generated policy spaces are compared with published results to give a direct reference with human expert decisions for the same simulated population. Our novel approach provides a powerful resource for policy makers, and a platform which can be readily extended to capture future more nuanced policy spaces.
MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence
Zheng, Lianmin (Shanghai Jiao Tong University) | Yang, Jiacheng (Shanghai Jiao Tong University) | Cai, Han (Shanghai Jiao Tong University ) | Zhou, Ming (Sichuan University) | Zhang, Weinan (Shanghai Jiao Tong University) | Wang, Jun (University College London) | Yu, Yong (Shanghai Jiao Tong University)
We introduce MAgent, a platform to support research and development of many-agent reinforcement learning. Unlike previous research platforms on single or multi-agent reinforcement learning, MAgent focuses on supporting the tasks and the applications that require hundreds to millions of agents. Within the interactions among a population of agents, it enables not only the study of learning algorithms for agents' optimal polices, but more importantly, the observation and understanding of individual agent's behaviors and social phenomena emerging from the AI society, including communication languages, leaderships, altruism. MAgent is highly scalable and can host up to one million agents on a single GPU server. MAgent also provides flexible configurations for AI researchers to design their customized environments and agents. In this demo, we present three environments designed on MAgent and show emerged collective intelligence by learning from scratch.
A Cognitive Assistant for Visualizing and Analyzing Exoplanets
Kephart, Jeffrey O. (IBM Thomas J. Watson Research Center) | Dibia, Victor C. (IBM Thomas J. Watson Research Center) | Ellis, Jason (IBM Thomas J. Watson Research Center) | Srivastava, Biplav (IBM Thomas J. Watson Research Center) | Talamadupula, Kartik (IBM Thomas J. Watson Research Center) | Dholakia, Mishal (IBM Thomas J. Watson Research Center)
We demonstrate an embodied cognitive agent that helps scientists visualize and analyze exo-planets and their host stars. The prototype is situated in a room equipped with a large display, microphones, cameras, speakers, and pointing devices. Users communicate with the agent via speech, gestures, and combinations thereof, and it responds by displaying content and generating synthesized speech.
Lifelong Learning Networks: Beyond Single Agent Lifelong Learning
Rostami, Mohammad (University of Pennsylvania) | Eaton, Eric (University of Pennsylvania)
Lifelong machine learning (LML) is a paradigm to design adaptive agents that can learn in dynamic environments. Current LML algorithms consider a single agent that has centralized access to all data. However, given privacy and security constraints, data might be distributed among multiple agents that can collaborate and learn from collective experience. Our goal is to extend LML from a single agent to a network of multiple agents that collectively learn a series of tasks.
Imitation Upper Confidence Bound for Bandits on a Graph
Lupu, Andrei (McGill University) | Precup, Doina (McGill University)
We consider a graph of interconnected agents implementing a common policy and each playing a bandit problem with identical reward distributions. We restrict the information propagated in the graph such that agents can uniquely observe each other's actions. We propose an extension of the Upper Confidence Bound (UCB) algorithm to this setting and empirically demonstrate that our solution improves the performance over UCB according to multiple metrics and within various graph configurations.