Goto

Collaborating Authors

 Agent Societies


A global collaboration to create "artificial organisms" just went live

#artificialintelligence

Mindfire, a new foundation with the goal of "decoding the mind" to help develop true artificial intelligence (AI) is launching November 17th in Zurich, Switzerland. Futurism spoke with the founder of Starmind and president of the foundation, Pascal Kaufmann to learn more about its goals and the path to reach them. "We cannot achieve True AI until we understand actual intelligence. Intelligence has evolved as a means of nature to successfully guide us through an ever-changing environment. This gave rise to behavior, emotions, and consciousness. These critical factors must be taken into account in how we develop AI. This is the purpose of the Mindfire Foundation," he explains.


Random gradient extrapolation for distributed and stochastic optimization

arXiv.org Machine Learning

In this paper, we consider a class of finite-sum convex optimization problems defined over a distributed multiagent network with $m$ agents connected to a central server. In particular, the objective function consists of the average of $m$ ($\ge 1$) smooth components associated with each network agent together with a strongly convex term. Our major contribution is to develop a new randomized incremental gradient algorithm, namely random gradient extrapolation method (RGEM), which does not require any exact gradient evaluation even for the initial point, but can achieve the optimal ${\cal O}(\log(1/\epsilon))$ complexity bound in terms of the total number of gradient evaluations of component functions to solve the finite-sum problems. Furthermore, we demonstrate that for stochastic finite-sum optimization problems, RGEM maintains the optimal ${\cal O}(1/\epsilon)$ complexity (up to a certain logarithmic factor) in terms of the number of stochastic gradient computations, but attains an ${\cal O}(\log(1/\epsilon))$ complexity in terms of communication rounds (each round involves only one agent). It is worth noting that the former bound is independent of the number of agents $m$, while the latter one only linearly depends on $m$ or even $\sqrt m$ for ill-conditioned problems. To the best of our knowledge, this is the first time that these complexity bounds have been obtained for distributed and stochastic optimization problems. Moreover, our algorithms were developed based on a novel dual perspective of Nesterov's accelerated gradient method.


Bounty Hunting and Human-Agent Group Task Allocation

AAAI Conferences

Much research has been done to apply auctions, markets, and negotiation mechanisms to solve the multiagent task allocation problem. However, there has been very little work on human-agent group task allocation. We believe that the notion of bounty hunting has good properties for human-agent group interaction in dynamic task allocation problems. We use previous experimental results comparing bounty hunting with auction-like methods to argue why it would be particularly adept at handling scenarios with unreliable collaborators and unexpectedly hard tasks: scenarios we believe highlight difficulties involved in working with humans collaborators.


Social Simulation for Social Justice

AAAI Conferences

We argue that social simulation can help us understand social justice issues. In particular, modeling certain social dynamics within computational systems can be used to creatively explore and better understand the social and identity dynamics of oppression. Writing theories of oppression in code forces us to explicate everything, and question what we leave out or what we can’t account for. As an early step in this direction, we present an in-progress social simulation of group discussion in activist meetings, developed in the already-existing AI system, Ensemble. Through this minimal, highly constrained social arena, we can explore wide-reaching phenomena like privilege, intersectionality, and power dynamics in nonhierarchical groups, but in a way that’s grounded in concrete, person-to-person interactions. We propose that this kind of social simulation can aid in the process of unlearning hegemonic ways of being, and imagining liberatory alternatives.


Guided Deep Reinforcement Learning for Swarm Systems

arXiv.org Machine Learning

In this paper, we investigate how to learn to control a group of cooperative agents with limited sensing capabilities such as robot swarms. The agents have only very basic sensor capabilities, yet in a group they can accomplish sophisticated tasks, such as distributed assembly or search and rescue tasks. Learning a policy for a group of agents is difficult due to distributed partial observability of the state. Here, we follow a guided approach where a critic has central access to the global state during learning, which simplifies the policy evaluation problem from a reinforcement learning point of view. For example, we can get the positions of all robots of the swarm using a camera image of a scene. This camera image is only available to the critic and not to the control policies of the robots. We follow an actor-critic approach, where the actors base their decisions only on locally sensed information. In contrast, the critic is learned based on the true global state. Our algorithm uses deep reinforcement learning to approximate both the Q-function and the policy. The performance of the algorithm is evaluated on two tasks with simple simulated 2D agents: 1) finding and maintaining a certain distance to each others and 2) locating a target.


A Discrete and Bounded Envy-Free Cake Cutting Protocol for Any Number of Agents

arXiv.org Artificial Intelligence

We consider the well-studied cake cutting problem in which the goal is to find an envy-free allocation based on queries from $n$ agents. The problem has received attention in computer science, mathematics, and economics. It has been a major open problem whether there exists a discrete and bounded envy-free protocol. We resolve the problem by proposing a discrete and bounded envy-free protocol for any number of agents. The maximum number of queries required by the protocol is $n^{n^{n^{n^{n^n}}}}$. We additionally show that even if we do not run our protocol to completion, it can find in at most $n^3{(n^2)}^n$ queries a partial allocation of the cake that achieves proportionality (each agent gets at least $1/n$ of the value of the whole cake) and envy-freeness. Finally we show that an envy-free partial allocation can be computed in at most $n^3{(n^2)}^n$ queries such that each agent gets a connected piece that gives the agent at least $1/(3n)$ of the value of the whole cake.


Spacetimes with Semantics (III) - The Structure of Functional Knowledge Representation and Artificial Reasoning

arXiv.org Artificial Intelligence

Using the previously developed concepts of semantic spacetime, I explore the interpretation of knowledge representations, and their structure, as a semantic system, within the framework of promise theory. By assigning interpretations to phenomena, from observers to observed, we may approach a simple description of knowledge-based functional systems, with direct practical utility. The focus is especially on the interpretation of concepts, associative knowledge, and context awareness. The inference seems to be that most if not all of these concepts emerge from purely semantic spacetime properties, which opens the possibility for a more generalized understanding of what constitutes a learning, or even `intelligent' system. Some key principles emerge for effective knowledge representation: 1) separation of spacetime scales, 2) the recurrence of four irreducible types of association, by which intent propagates: aggregation, causation, cooperation, and similarity, 3) the need for discrimination of identities (discrete), which is assisted by distinguishing timeline simultaneity from sequential events, and 4) the ability to learn (memory). It is at least plausible that emergent knowledge abstraction capabilities have their origin in basic spacetime structures. These notes present a unified view of mostly well-known results; they allow us to see information models, knowledge representations, machine learning, and semantic networking (transport and information base) in a common framework. The notion of `smart spaces' thus encompasses artificial systems as well as living systems, across many different scales, e.g. smart cities and organizations.


Perturbation Training for Human-Robot Teams

Journal of Artificial Intelligence Research

In this work, we design and evaluate a computational learning model that enables a human-robot team to co-develop joint strategies for performing novel tasks that require coordination. The joint strategies are learned through "perturbation training," a human team-training strategy that requires team members to practice variations of a given task to help their team generalize to new variants of that task. We formally define the problem of human-robot perturbation training and develop and evaluate the first end-to-end framework for such training, which incorporates a multi-agent transfer learning algorithm, human-robot co-learning framework and communication protocol. Our transfer learning algorithm, Adaptive Perturbation Training (AdaPT), is a hybrid of transfer and reinforcement learning techniques that learns quickly and robustly for new task variants. We empirically validate the benefits of AdaPT through comparison to other hybrid reinforcement and transfer learning techniques aimed at transferring knowledge from multiple source tasks to a single target task. We also demonstrate that AdaPT's rapid learning supports live interaction between a person and a robot, during which the human-robot team trains to achieve a high level of performance for new task variants. We augment AdaPT with a co-learning framework and a computational bi-directional communication protocol so that the robot can co-train with a person during live interaction. Results from large-scale human subject experiments (n=48) indicate that AdaPT enables an agent to learn in a manner compatible with a human's own learning process, and that a robot undergoing perturbation training with a human results in a high level of team performance. Finally, we demonstrate that human-robot training using AdaPT in a simulation environment produces effective performance for a team incorporating an embodied robot partner.


Privacy Preserving Implementation of the Max-Sum Algorithm and its Variants

Journal of Artificial Intelligence Research

One of the basic motivations for solving DCOPs is maintaining agents' privacy. Thus, researchers have evaluated the privacy loss of DCOP algorithms and defined corresponding notions of privacy preservation for secured DCOP algorithms. However, no secured protocol was proposed for Max-Sum, which is among the most studied DCOP algorithms. As part of the ongoing effort of designing secure DCOP algorithms, we propose P-Max-Sum, the first private algorithm that is based on Max-Sum. The proposed algorithm has multiple agents preforming the role of each node in the factor graph, on which the Max-Sum algorithm operates. P-Max-Sum preserves three types of privacy: topology privacy, constraint privacy, and assignment/decision privacy. By allowing a single call to a trusted coordinator, P-Max-Sum also preserves agent privacy. The two main cryptographic means that enable this privacy preservation are secret sharing and homomorphic encryption. In addition, we design privacy-preserving implementations of four variants of Max-Sum. We conclude by analyzing the price of privacy in terns of runtime overhead, both theoretically and by extensive experimentation.


Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability

arXiv.org Artificial Intelligence

Many real-world tasks involve multiple agents with partial observability and limited communication. Learning is challenging in these settings due to local viewpoints of agents, which perceive the world as non-stationary due to concurrently-exploring teammates. Approaches that learn specialized policies for individual tasks face problems when applied to the real world: not only do agents have to learn and store distinct policies for each task, but in practice identities of tasks are often non-observable, making these approaches inapplicable. This paper formalizes and addresses the problem of multi-task multi-agent reinforcement learning under partial observability. We introduce a decentralized single-task learning approach that is robust to concurrent interactions of teammates, and present an approach for distilling single-task policies into a unified policy that performs well across multiple related tasks, without explicit provision of task identity.