Agents
Bug brains help AI solve navigation challenges
Drones and other autonomous robots require mobile and efficient solutions to real-life issues, from mundane package transportation to urgent search and rescue missions. Using machine learning and a vector-based navigation system inspired by insects, agents could navigate to key locations without relying on GPS -- becoming truly autonomous. Robots could learn to navigate independently to wildfires based on environmental sensory cues, using information from cameras and other sensors. Since vectors are represented in a geocentric context, multiple agents could communicate locations to each other, which could, for example, speed up efforts to perform rescues and put out fires. Such flexibility and speed of coordination would greatly improve the success and efficiency of rescue missions during natural disasters -- and save lives.
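The idea of geocentric vectors can be illustrated with a minimal sketch of path integration. This is not the researchers' system: the function names, the (heading, distance) step format, and the nest-at-origin convention are all assumptions made for illustration. Each agent sums its steps into one displacement vector in a shared world frame, so a location found by one agent can be reused by another through simple vector subtraction.

```python
import math

def integrate_path(steps):
    """Sum (heading_radians, distance) steps into one (x, y) displacement.

    Headings are measured in a shared world (geocentric) frame, which is
    an assumption of this sketch; the start point is taken as (0, 0).
    """
    x = y = 0.0
    for heading, distance in steps:
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    return (x, y)

def vector_to(target, position):
    """Vector an agent at `position` must travel to reach `target`."""
    return (target[0] - position[0], target[1] - position[1])

# Agent A walks 3 units east, then 4 units north, and finds a target there.
target = integrate_path([(0.0, 3.0), (math.pi / 2, 4.0)])

# A's home vector points straight back to the start.
home = vector_to((0.0, 0.0), target)

# Agent B, currently 1 unit east of the start, can reuse A's target vector.
b_route = vector_to(target, (1.0, 0.0))
```

Because both agents express positions in the same frame, B never has to retrace A's route; it only needs the shared target vector and its own integrated position.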
Discrete-Time Polar Opinion Dynamics with Susceptibility
Liu, Ji, Ye, Mengbin, Anderson, Brian D. O., Başar, Tamer, Nedić, Angelia
This paper considers a discrete-time opinion dynamics model in which each individual's susceptibility to being influenced by others is dependent on her current opinion. We assume that the social network has time-varying topology and that the opinions are scalars on a continuous interval. We first propose a general opinion dynamics model based on the DeGroot model, with a general function to describe the functional dependence of each individual's susceptibility on her own opinion, and show that this general model is analogous to the Friedkin-Johnsen model, which assumes a constant susceptibility for each individual. We then consider two specific functions in which the individual's susceptibility depends on the polarity of her opinion, and provide motivating social examples. First, we consider stubborn positives, who have reduced susceptibility if their opinions are at one end of the interval and increased susceptibility if their opinions are at the opposite end. A court jury is used as a motivating example. Second, we consider stubborn neutrals, who have reduced susceptibility when their opinions are in the middle of the spectrum, and our motivating examples are social networks discussing established social norms or institutionalized behavior. For each specific susceptibility model, we establish the initial and graph topology conditions under which consensus is reached, and develop necessary and sufficient conditions on the initial conditions for the final consensus value to be at either extreme of the opinion interval. Simulations are provided to show the effects of the susceptibility function when compared to the DeGroot model.
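A rough sketch of a DeGroot-style update with opinion-dependent susceptibility may help make the abstract concrete. The abstract does not give the equations, so everything specific here is an assumption: the update form x_i + s(x_i) * (weighted average - x_i), the opinion interval [-1, 1], the fixed (not time-varying) weight matrix, and the two susceptibility functions, which are simply one plausible reading of "stubborn positives" and "stubborn neutrals".

```python
def degroot_step(opinions, weights, susceptibility):
    """One synchronous update of all opinions.

    weights[i][j] is the influence of agent j on agent i; each row is
    assumed to sum to 1. susceptibility(x) is assumed to lie in [0, 1].
    """
    updated = []
    for i, x_i in enumerate(opinions):
        neighbor_avg = sum(w * x_j for w, x_j in zip(weights[i], opinions))
        updated.append(x_i + susceptibility(x_i) * (neighbor_avg - x_i))
    return updated

def constant_susceptibility(x):
    """Full susceptibility: recovers the plain DeGroot averaging step."""
    return 1.0

def stubborn_positive(x):
    """Assumed form: no susceptibility at x = 1, full at x = -1."""
    return (1.0 - x) / 2.0

def stubborn_neutral(x):
    """Assumed form: no susceptibility at x = 0, full at the extremes."""
    return x * x

def simulate(opinions, weights, susceptibility, steps):
    """Apply degroot_step repeatedly and return the final opinions."""
    for _ in range(steps):
        opinions = degroot_step(opinions, weights, susceptibility)
    return opinions

# Three agents on a complete graph with uniform weights.
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in range(3)]
plain = degroot_step([-0.5, 0.0, 0.8], uniform, constant_susceptibility)
polar = simulate([0.9, -0.2, 0.3], uniform, stubborn_neutral, 50)
```

Under these assumptions each new opinion is a convex combination of the agent's own opinion and its neighbor average, so opinions stay inside the interval, and an agent whose susceptibility is zero does not move at all.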
Anthropic decision theory
This paper sets out to resolve how agents ought to act in the Sleeping Beauty problem and various related anthropic (self-locating belief) problems, not through the calculation of anthropic probabilities, but through finding the correct decision to make. It creates an anthropic decision theory (ADT) that decides these problems from a small set of principles. By doing so, it demonstrates that the attitude of agents with regard to each other (selfish or altruistic) changes the decisions they reach, and that it is very important to take this into account. To illustrate ADT, it is then applied to two major anthropic problems and paradoxes, the Presumptuous Philosopher and Doomsday problems, thus resolving some issues about the probability of human extinction.
Introducing: Unity Machine Learning Agents – Unity Blog
Our two previous blog entries implied that there is a role games can play in driving the development of Reinforcement Learning algorithms. As the world's most popular creation engine, Unity is at the crossroads between machine learning and gaming. It is critical to our mission to enable machine learning researchers with the most powerful training scenarios, and for us to give back to the gaming community by enabling them to utilize the latest machine learning technologies. As the first step in this endeavor, we are excited to introduce Unity Machine Learning Agents. Machine Learning is changing the way we expect to get intelligent behavior out of autonomous agents.
Augment raises $5 million to help customer service agents with AI
Augment today announced it has raised $5 million for an AI platform that assists customer service agents at large companies. The startup had operated in stealth for 10 months prior to launch. The company joins competitors like Mattersight, DigitalGenius, LivePerson, and others in its efforts to train AI using conversations between customers and businesses in order to better guide customer service agents. The money will be used to bolster the Augment AI platform, which is trained by an aggregated dataset made up of 100 million conversational interactions at large companies, including Dyson. Augment makes no attempt to replace human agents, only to make them more efficient.
Game Theory for Data Science: Eliciting Truthful Information
Faltings, Boi, Radanovic, Goran
Intelligent systems often depend on data provided by information agents, for example, sensor data or crowdsourced human computation. Providing accurate and relevant data requires costly effort that agents may not always be willing to provide. Thus, it becomes important not only to verify the correctness of data, but also to provide incentives so that agents that provide high-quality data are rewarded while those that do not are discouraged by low rewards. We cover different settings and the assumptions they admit, including sensing, human computation, peer grading, reviews, and predictions. We survey different incentive mechanisms, including proper scoring rules, prediction markets and peer prediction, Bayesian Truth Serum, Peer Truth Serum, Correlated Agreement, and the settings where each of them would be suitable.
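As a small illustration of one mechanism named above, the quadratic (Brier-type) rule is a standard example of a proper scoring rule: an agent maximizes its expected reward by reporting its true belief. The sketch below is not taken from the book; the function names and the example probabilities are assumptions chosen for illustration.

```python
def quadratic_score(report, outcome):
    """Reward for a reported probability vector once `outcome` is observed.

    Uses the quadratic rule 2 * q[outcome] - sum(q_j ** 2).
    """
    return 2.0 * report[outcome] - sum(q * q for q in report)

def expected_quadratic_score(belief, report):
    """Expected reward for `report` when outcomes follow `belief`."""
    return sum(
        p * quadratic_score(report, outcome)
        for outcome, p in enumerate(belief)
    )

# An agent that believes the three outcomes have these probabilities.
belief = [0.7, 0.2, 0.1]
truthful = expected_quadratic_score(belief, belief)

# Alternative reports: shaded, overconfident, and uninformative.
alternatives = [[0.5, 0.3, 0.2], [1.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3]]
others = [expected_quadratic_score(belief, q) for q in alternatives]
```

The gap between the truthful score and any other report equals the squared distance between the belief and the report, which is why misreporting never pays in expectation under this rule.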
The 10 Algorithms Machine Learning Engineers Need to Know
Some of the most common examples of machine learning are Netflix's algorithms to make movie suggestions based on movies you have watched in the past or Amazon's algorithms that recommend books based on books you have bought before. The textbook that we used is one of the AI classics: Peter Norvig's Artificial Intelligence -- A Modern Approach, in which we covered major topics including intelligent agents, problem-solving by searching, adversarial search, probability theory, multi-agent systems, social AI, philosophy/ethics/future of AI. Machine learning algorithms can be divided into 3 broad categories -- supervised learning, unsupervised learning, and reinforcement learning. Supervised learning is useful in cases where a property (label) is available for a certain dataset (training set), but is missing and needs to be predicted for other instances. You can think of linear regression as the task of fitting a straight line through a set of points.
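The line-fitting view of linear regression can be sketched with ordinary least squares in a few lines. This is a minimal illustration, not code from the article: the function name `fit_line` and the sample points are assumptions, and it handles only a single input variable.

```python
def fit_line(points):
    """Fit y = slope * x + intercept to (x, y) points by least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Spread of x, and how x and y vary together around their means.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Predict the label for a new instance from the fitted line."""
    return slope * x + intercept

# Training points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
estimate = predict(slope, intercept, 4.0)
```

Here the training set plays the role of the labeled data, and `predict` supplies the missing label for an instance that was not in it.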
DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self
Moulin-Frier, Clément, Fischer, Tobias, Petit, Maxime, Pointeau, Grégoire, Puigbo, Jordi-Ysard, Pattacini, Ugo, Low, Sock Ching, Camilleri, Daniel, Nguyen, Phuong, Hoffmann, Matej, Chang, Hyung Jin, Zambelli, Martina, Mealier, Anne-Laure, Damianou, Andreas, Metta, Giorgio, Prescott, Tony J., Demiris, Yiannis, Dominey, Peter Ford, Verschure, Paul F. M. J.
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
Guided Deep Reinforcement Learning for Swarm Systems
Hüttenrauch, Maximilian, Šošić, Adrian, Neumann, Gerhard
In this paper, we investigate how to learn to control a group of cooperative agents with limited sensing capabilities such as robot swarms. The agents have only very basic sensor capabilities, yet in a group they can accomplish sophisticated tasks, such as distributed assembly or search and rescue tasks. Learning a policy for a group of agents is difficult due to distributed partial observability of the state. Here, we follow a guided approach where a critic has central access to the global state during learning, which simplifies the policy evaluation problem from a reinforcement learning point of view. For example, we can get the positions of all robots of the swarm using a camera image of a scene. This camera image is only available to the critic and not to the control policies of the robots. We follow an actor-critic approach, where the actors base their decisions only on locally sensed information. In contrast, the critic is learned based on the true global state. Our algorithm uses deep reinforcement learning to approximate both the Q-function and the policy. The performance of the algorithm is evaluated on two tasks with simple simulated 2D agents: 1) finding and maintaining a certain distance to each other and 2) locating a target.
AI Uses Less Than Two Minutes of Videogame Footage to Recreate Game Engine
Game studios and enthusiasts may soon have a new tool at their disposal to speed up game development and experiment with different styles of play. Georgia Institute of Technology researchers have developed a new approach using an artificial intelligence to learn a complete game engine, the basic software of a game that governs everything from character movement to rendering graphics. Their AI system watches less than two minutes of gameplay video and then builds its own model of how the game operates by studying the frames and making predictions of future events, such as what path a character will choose or how enemies might react. To get their AI agent to create an accurate predictive model that could account for all the physics of a 2D platform-style game, the team trained the AI on a single "speedrunner" video, where a player heads straight for the goal. This made "the training problem for the AI as difficult as possible."