 Agents


Unsupervised Predictive Memory in a Goal-Directed Agent

arXiv.org Machine Learning

Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement learning (RL) algorithms with deep neural networks, and the excitement surrounding these results has led to the pursuit of related ideas as explanations of non-human animal learning. However, we demonstrate that contemporary RL algorithms struggle to solve simple tasks when enough information is concealed from the sensors of the agent, a property called "partial observability". An obvious requirement for handling partially observed tasks is access to extensive memory, but we show memory is not enough; it is critical that the right information be stored in the right format. We develop a model, the Memory, RL, and Inference Network (MERLIN), in which memory formation is guided by a process of predictive modeling. MERLIN facilitates the solution of tasks in 3D virtual reality environments for which partial observability is severe and memories must be maintained over long durations. Our model demonstrates a single learning agent architecture that can solve canonical behavioural tasks in psychology and neurobiology without strong simplifying assumptions about the dimensionality of sensory input or the duration of experiences.



AAAI News

AI Magazine

The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) and the Thirty-First Conference on Innovative Applications of Artificial Intelligence (IAAI-19) will be held in Honolulu, Hawaii, USA, January 27 - February 1, 2019. The technical conference will continue its 3.5-day schedule, preceded by the workshop and tutorial programs. Recently, AAAI coordinated and cosigned a statement with CRA, expressing concern about the proposed tax bill and its ramifications for graduate student stipends. Other organizational ...


The Taboo Challenge Competition

AI Magazine

Games have always been a popular domain of AI research, and they have been used for many recent competitions. However, reaching human-level performance often focuses either on comprehensive world knowledge or on solving decision-making problems with unmanageable solution spaces. Building on the popular Taboo board game, the Taboo Challenge Competition addresses a different problem: that of bridging the gap between the domain knowledge of heterogeneous agents trying to jointly identify a concept without making reference to its most salient features. The competition, which was run for the first time at IJCAI 2017, aims to provide a simple testbed for diversity-aware AI where the focus is on integrating independently engineered AI components, while offering a scenario that is challenging yet simple enough to not require mastering general commonsense knowledge or natural language understanding. We describe the design and preparation of the competition and discuss results and lessons learned.


Robust Decentralized Learning Using ADMM with Unreliable Agents

arXiv.org Machine Learning

Many machine learning problems can be formulated as consensus optimization problems, which can be solved efficiently via a cooperative multi-agent system. However, the agents in the system can be unreliable for a variety of reasons: noise, faults, and attacks. Falsified data from such agents leads the optimization process in the wrong direction and degrades the performance of distributed machine learning algorithms. This paper considers the problem of decentralized learning using ADMM in the presence of unreliable agents. First, we rigorously analyze the effect of falsified updates (in ADMM learning iterations) on the convergence behavior of the multi-agent system. We show that the algorithm converges linearly to a neighborhood of the optimal solution under certain conditions and characterize the neighborhood size analytically. Next, we provide guidelines for network structure design to achieve faster convergence. We also provide necessary conditions on the falsified updates for exact convergence to the optimal solution. Finally, to mitigate the influence of unreliable agents, we propose a robust variant of ADMM and show its resilience to unreliable agents.
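To make the effect of a falsified update concrete, here is a minimal sketch. It is not the paper's algorithm: it swaps in the simpler global-consensus form of ADMM (with a central averaging step) instead of a decentralized network, and it assumes scalar quadratic local objectives f_i(x) = 0.5 (x - a_i)^2, whose consensus optimum is the mean of the a_i. The function name `consensus_admm`, the `bias` parameter, and the constant-offset attack model are all hypothetical choices made for illustration.

```python
def consensus_admm(targets, rho=1.0, iters=200, bias=None):
    """Global-consensus ADMM for f_i(x) = 0.5 * (x - a_i)^2 (illustrative only).

    targets: the local values a_i, one per agent.
    bias:    optional per-agent constant added to each reported update,
             a hypothetical stand-in for a falsified update.
    Returns the consensus variable z after `iters` iterations.
    """
    n = len(targets)
    bias = bias or [0.0] * n
    z = 0.0            # shared consensus variable
    u = [0.0] * n      # scaled dual variables, one per agent
    for _ in range(iters):
        # Local step: closed-form minimizer of
        # f_i(x) + (rho / 2) * (x - z + u_i)^2, then the agent reports it
        # (an unreliable agent reports the value shifted by its bias).
        reported = [
            (a + rho * (z - u_i)) / (1.0 + rho) + b
            for a, u_i, b in zip(targets, u, bias)
        ]
        # Consensus step: average of reported values plus duals.
        z = sum(x + u_i for x, u_i in zip(reported, u)) / n
        # Dual step: accumulate each agent's disagreement with z.
        u = [u_i + x - z for x, u_i in zip(reported, u)]
    return z


honest = consensus_admm([1.0, 2.0, 3.0, 4.0])
attacked = consensus_admm([1.0, 2.0, 3.0, 4.0], bias=[0.0, 0.0, 0.0, 0.5])
print(honest, attacked)
```

In this toy setting the honest run settles at the mean of the targets, while the run with one biased agent settles at a nearby but shifted point. This only loosely mirrors the "neighborhood of the optimal solution" behavior described in the abstract.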


Entropy Controlled Non-Stationarity for Improving Performance of Independent Learners in Anonymous MARL Settings

arXiv.org Machine Learning

With the advent of sequential matching (of supply and demand) systems (Uber, Lyft, Grab for taxis; UberEats, Deliveroo, etc. for food; Amazon Prime, Lazada, etc. for groceries) across many online and offline services, individuals (taxi drivers, delivery boys, delivery van drivers, etc.) earn more by being at the "right" place at the "right" time. We focus on learning techniques for providing guidance (on right locations to be at right times) to individuals in the presence of other "learning" individuals. Interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) is independent of the identity of the agents, and therefore we refer to these as Anonymous MARL settings. Existing research of relevance is on independent learning using Reinforcement Learning (RL) or on Multi-Agent Reinforcement Learning (MARL). The number of individuals in aggregation systems is extremely large and individuals have their own selfish interest (of maximising revenue). Therefore, traditional MARL approaches are either not scalable or rely on assumptions of a common objective or action coordination that are not viable. In this paper, we focus on improving the performance of independent reinforcement learners, specifically the popular Deep Q-Networks (DQN) and Advantage Actor Critic (A2C) approaches, by exploiting anonymity. Specifically, we control the non-stationarity introduced by other agents using the entropy of the agent density distribution. We demonstrate a significant improvement in revenue for individuals and for all agents together with our learners on a generic experimental setup for aggregation systems and a real-world taxi dataset.
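As a small illustration of the quantity named above, the sketch below computes the Shannon entropy of an agent density distribution, assuming agents are assigned to a fixed set of discrete zones. The function name `density_entropy` and the zone encoding are hypothetical. The abstract does not say how the entropy enters the DQN or A2C objective, so this sketch covers only the entropy itself.

```python
import math
from collections import Counter


def density_entropy(agent_zones, num_zones):
    """Shannon entropy (in nats) of the fraction of agents in each zone.

    agent_zones: iterable of zone ids (0 .. num_zones - 1), one per agent.
    num_zones:   total number of zones (hypothetical discretization).
    """
    zones = list(agent_zones)
    counts = Counter(zones)
    total = len(zones)
    entropy = 0.0
    for zone in range(num_zones):
        p = counts.get(zone, 0) / total
        if p > 0:  # empty zones contribute nothing (0 * log 0 := 0)
            entropy -= p * math.log(p)
    return entropy


# All agents crowded into one zone vs. spread evenly over four zones.
print(density_entropy([0, 0, 0, 0], 4))
print(density_entropy([0, 1, 2, 3], 4))
```

The entropy is zero when every agent is in the same zone and reaches its maximum, log(num_zones), when agents are spread uniformly.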


A Dynamic-Adversarial Mining Approach to the Security of Machine Learning

arXiv.org Machine Learning

Operating in a dynamic real-world environment requires a forward-thinking and adversarial-aware design for classifiers, beyond fitting the model to the training data. In such scenarios, it is necessary to make classifiers (a) harder to evade, (b) better able to detect changes in the data distribution over time, and (c) able to retrain and recover from model degradation. While most work in the security of machine learning has concentrated on the evasion resistance problem (a), there is little work in the areas of reacting to attacks (b and c). Additionally, while streaming data research concentrates on the ability to react to changes in the data distribution, it often takes an adversarial-agnostic view of the security problem. This leaves such methods vulnerable to adversarial activity aimed at evading the concept drift detection mechanism itself. In this paper, we analyze the security of machine learning from a dynamic and adversarial-aware perspective. The existing techniques of restrictive one-class classifier models, complex learning models, and randomization-based ensembles are shown to be myopic, as they approach security as a static task. These methodologies are ill-suited for a dynamic environment, as they leak excessive information to an adversary, who can subsequently launch attacks that are indistinguishable from the benign data. Based on empirical vulnerability analysis against a sophisticated adversary, a novel feature importance hiding approach for classifier design is proposed. The proposed design ensures that future attacks on classifiers can be detected and recovered from. The proposed work motivates, and serves as a blueprint for, future work in the area of Dynamic-Adversarial mining, which combines lessons learned from streaming data mining, adversarial learning, and cybersecurity.


Handling Adversarial Concept Drift in Streaming Data

arXiv.org Machine Learning

Classifiers operating in a dynamic, real-world environment are vulnerable to adversarial activity, which causes the data distribution to change over time. These changes are traditionally referred to as concept drift, and several approaches have been developed in the literature to deal with the problem of drift handling and detection. However, most concept drift handling techniques approach it as a domain-independent task, to make them applicable to a wide gamut of reactive systems. These techniques were developed from an adversarial-agnostic perspective, in which they naively assume that drift is a benign change that can be fixed by updating the model. However, this is not the case when an active adversary is trying to evade the deployed classification system. In such an environment, the properties of concept drift are unique, as the drift is intended to degrade the system and at the same time designed to avoid detection by traditional concept drift detection techniques. This special category of drift is termed adversarial drift, and this paper analyzes its characteristics and impact in a streaming environment. A novel framework for dealing with adversarial concept drift is proposed, called the Predict-Detect streaming framework. Experimental evaluation of the framework on generated adversarial drifting data streams demonstrates that it is able to provide reliable unsupervised indication of drift and to recover from drifts swiftly. While traditional partially labeled concept drift detection methodologies fail to detect adversarial drifts, the proposed framework is able to detect such drifts and operates with less than 6% labeled data, on average. Also, the framework provides benefits for active learning over imbalanced data streams by innately providing for feature space honeypots, where minority class adversarial samples may be captured.


Generative Multi-Agent Behavioral Cloning

arXiv.org Machine Learning

We propose and study the problem of generative multi-agent behavioral cloning, where the goal is to learn a generative multi-agent policy from pre-collected demonstration data. Building upon advances in deep generative models, we present a hierarchical policy framework that can tractably learn complex mappings from input states to distributions over multi-agent action spaces. Our framework is flexible and can incorporate high-level domain knowledge into the structure of the underlying deep graphical model. For instance, we can effectively learn low-dimensional structures, such as long-term goals and team coordination, from data. Thus, an additional benefit of our hierarchical approach is the ability to plan over multiple time scales for effective long-term planning. We showcase our approach in an application of modeling team offensive play from basketball tracking data. We show how to instantiate our framework to effectively model complex interactions between basketball players and generate realistic multi-agent trajectories of basketball gameplay over long time periods. We validate our approach using both quantitative and qualitative evaluations, including a user study comparison conducted with professional sports analysts.


Meet the company trying to merge the human brain and A.I. to predict real-world events

#artificialintelligence

Rather than being fearful of machines rising up against humans, one company is actively trying to merge the two, by combining human intelligence with computer algorithms to predict a whole series of real-world events. Unanimous AI is a company that uses technology that draws from a concept commonly found in nature: swarm intelligence. Rather than using algorithms to replace human intelligence, the firm tries to amplify it. "The artificial swarm intelligence really refers to the way in which we actually combine humans with technology in order to come to these amplified outsets, or amplified outcomes," David Baltaxe, chief intelligence officer at Unanimous AI, said on Tuesday. Biologists and zoologists have been studying swarm intelligence in systems of insects and animals, like fishes, birds and honeybees, for a long period of time, Baltaxe told CNBC at the Credit Suisse Asian Investment Conference.