Agents
Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences
Fiez, Tanner, Sekar, Shreyas, Zheng, Liyuan, Ratliff, Lillian J.
The design of personalized incentives or recommendations to improve user engagement is gaining prominence as digital platform providers continually emerge. We propose a multi-armed bandit framework for matching incentives to users, whose preferences are unknown a priori and evolve dynamically over time, in a resource-constrained environment. We design an algorithm that combines ideas from three distinct domains: (i) a greedy matching paradigm, (ii) the upper confidence bound algorithm (UCB) for bandits, and (iii) mixing times from the theory of Markov chains. For this algorithm, we provide theoretical bounds on the regret and demonstrate its performance via both synthetic and realistic (matching supply and demand in a bike-sharing platform) examples.
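To illustrate how the first two ingredients named in the abstract might fit together, here is a minimal Python sketch that computes a UCB index for every (user, incentive) pair and then matches greedily, with each incentive given to at most one user per round. It is not the authors' algorithm: it assumes static Bernoulli preferences and leaves out the Markov-chain dynamics and mixing-time analysis entirely. The names `greedy_matching`, `ucb_greedy`, and `true_means` are hypothetical.

```python
import math
import random


def greedy_matching(index, n_users, n_incentives):
    """Match users to incentives greedily by decreasing index value.

    Each user receives at most one incentive and each incentive is
    handed out at most once (the resource constraint in this sketch).
    """
    pairs = sorted(
        ((u, i) for u in range(n_users) for i in range(n_incentives)),
        key=lambda p: index[p[0]][p[1]],
        reverse=True,
    )
    used_users, used_incentives, matching = set(), set(), {}
    for u, i in pairs:
        if u not in used_users and i not in used_incentives:
            matching[u] = i
            used_users.add(u)
            used_incentives.add(i)
    return matching


def ucb_greedy(true_means, horizon, seed=0):
    """Run UCB indices plus greedy matching on static Bernoulli rewards.

    Assumption: true_means[u][i] is a fixed success probability; the
    paper's dynamically evolving preferences are NOT modelled here.
    """
    rng = random.Random(seed)
    n_users, n_incentives = len(true_means), len(true_means[0])
    counts = [[0] * n_incentives for _ in range(n_users)]
    sums = [[0.0] * n_incentives for _ in range(n_users)]
    total_reward = 0.0
    for t in range(1, horizon + 1):
        # Untried pairs get an infinite index so they are explored first.
        index = [
            [
                float("inf")
                if counts[u][i] == 0
                else sums[u][i] / counts[u][i]
                + math.sqrt(2.0 * math.log(t) / counts[u][i])
                for i in range(n_incentives)
            ]
            for u in range(n_users)
        ]
        for u, i in greedy_matching(index, n_users, n_incentives).items():
            reward = 1.0 if rng.random() < true_means[u][i] else 0.0
            counts[u][i] += 1
            sums[u][i] += reward
            total_reward += reward
    return counts, total_reward
```

For example, `ucb_greedy([[0.9, 0.1], [0.8, 0.2]], horizon=50)` returns the per-pair play counts and the accumulated reward. Each round plays as many pairs as the smaller of the number of users and the number of incentives.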
Multi-robot Path Planning in Well-formed Infrastructures: Prioritized Planning vs. Prioritized Wait Adjustment (Preliminary Results)
Andreychuk, Anton, Yakovlev, Konstantin
We study the problem of planning collision-free paths for a group of homogeneous robots. We propose a novel approach for turning the paths that were planned egocentrically by the robots, i.e. without taking other robots' moves into account, into collision-free trajectories and evaluate it empirically. The suggested algorithm is much faster (up to one order of magnitude) than the state of the art, but this comes at the price of a notable degradation in solution cost.
DeepMind AI's new trick is playing 'Quake III Arena' like a human
The team focused on a capture the flag mode, one in which the map changes from match to match. Its AI agents had to learn general strategies to be able to adapt to each new map, something humans do easily. The agents also needed to cooperate with team members, compete against the opposing team, and adjust to different enemy play styles. "Our agents must learn from scratch how to see, act, cooperate, and compete in unseen environments, all from a single reinforcement signal per match: whether their team won or not," wrote the researchers in a blog post. They trained a population of AI-powered agents that learn by playing the game, much like we do.
The Power of Verification for Greedy Mechanism Design
Fotakis, Dimitris, Krysta, Piotr, Ventre, Carmine
Greedy algorithms are known to provide, in polynomial time, near-optimal approximation guarantees for Combinatorial Auctions (CAs) with multidimensional bidders. It is known that truthful greedy-like mechanisms for CAs with multi-minded bidders do not achieve good approximation guarantees. In this work, we seek a deeper understanding of greedy mechanism design and investigate under which general assumptions we can have efficient and truthful greedy mechanisms for CAs. Towards this goal, we use the framework of priority algorithms and weak and strong verification, where the bidders are not allowed to overbid on their winning set or on any subset of this set, respectively. We provide a complete characterization of the power of weak verification, showing that it is sufficient and necessary for any greedy fixed priority algorithm to become truthful with the use of money or not, depending on the ordering of the bids. Moreover, we show that strong verification is sufficient and necessary to obtain a 2-approximate truthful mechanism with money, based on a known greedy algorithm, for the problem of submodular CAs in finite bidding domains. Our proof is based on an interesting structural analysis of the strongly connected components of the declaration graph.
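As a rough illustration of what a greedy fixed priority algorithm for a combinatorial auction can look like, the Python sketch below orders the declared bids by a priority function and grants each bid whose items are still free. It covers the allocation step only: verification, payments, and truthfulness are not modelled. The function name `greedy_allocation`, the bid tuple layout, and the default priority (the declared value) are assumptions made for this example, not taken from the paper.

```python
def greedy_allocation(bids, priority=lambda items, value: value):
    """Allocate item bundles greedily in a fixed priority order.

    bids: iterable of (bidder, items, value) tuples, where items is a
          frozenset of item identifiers; a multi-minded bidder may
          appear in several tuples.
    priority: hypothetical ranking function over (items, value);
          higher-priority bids are considered first.
    Returns a dict mapping each winning bidder to the granted bundle.
    """
    ordered = sorted(bids, key=lambda b: priority(b[1], b[2]), reverse=True)
    allocated = set()  # items already handed out
    winners = {}       # bidder -> granted bundle
    for bidder, items, value in ordered:
        # Grant the bid only if the bidder has not won yet and
        # none of the requested items has been taken.
        if bidder not in winners and allocated.isdisjoint(items):
            winners[bidder] = items
            allocated |= items
    return winners


example_bids = [
    ("a", frozenset({1, 2}), 10),
    ("b", frozenset({2, 3}), 8),
    ("c", frozenset({3}), 5),
]
result = greedy_allocation(example_bids)
```

In this example bidder `a` is served first, bidder `b` is rejected because item 2 is already taken, and bidder `c` then receives item 3. Passing a different `priority`, such as value divided by the square root of the bundle size, changes the order in which bids are considered but not the procedure.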
DeepMind AI's new trick is playing 'Quake III Arena' like a human
Research in AI continues to make video games better. The technology informs NPCs that can move and fight more convincingly, orcs with personalities and ever-more realistic visuals. Now researchers at DeepMind have taught an AI to play a customized version of Quake III Arena like a human. The team focused on a capture the flag mode, one in which the map changes from match to match. Its AI agents had to learn general strategies to be able to adapt to each new map, something humans do easily.
Human-level performance in first-person multiplayer games with population-based deep reinforcement learning
Jaderberg, Max, Czarnecki, Wojciech M., Dunning, Iain, Marris, Luke, Lever, Guy, Castaneda, Antonio Garcia, Beattie, Charles, Rabinowitz, Neil C., Morcos, Ari S., Ruderman, Avraham, Sonnerat, Nicolas, Green, Tim, Deason, Louise, Leibo, Joel Z., Silver, David, Hassabis, Demis, Kavukcuoglu, Koray, Graepel, Thore
Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents, and environments reflecting this degree of complexity remain an open challenge. In this work, we demonstrate for the first time that an agent can achieve human-level performance in a popular 3D multiplayer first-person video game, Quake III Arena Capture the Flag, using only pixels and game points as input. These results were achieved by a novel two-tier optimisation process in which a population of independent RL agents is trained concurrently from thousands of parallel matches with agents playing in teams together and against each other on randomly generated environments. Each agent in the population learns its own internal reward signal to complement the sparse delayed reward from winning, and selects actions using a novel temporally hierarchical representation that enables the agent to reason at multiple timescales. During game-play, these agents display human-like behaviours such as navigating, following, and defending based on a rich learned representation that is shown to encode high-level game knowledge. In an extensive tournament-style evaluation the trained agents exceeded the win-rate of strong human players both as teammates and opponents, and proved far stronger than existing state-of-the-art agents. These results demonstrate a significant jump in the capabilities of artificial agents, bringing us closer to the goal of human-level intelligence.
Can Artificial Intelligence End Your Video Buffering Problems? - Muvi
Currently, we stand on the brink of a fourth Industrial Revolution. Artificial Intelligence, or AI, is the intelligence demonstrated by machines in performing tasks. It is a specialized branch of computer science that focuses on creating intelligent machines that think, react and work like humans. Some of the activities that computers with artificial intelligence are built for include problem solving, learning, analysing, speech recognition, and much more. AI is a vast field in itself, and it encompasses a wide spectrum of technologies such as Machine Learning, Automated Intelligence Systems, Deep Learning, Neural Networks, Computational Argumentation, and Multi-agent Systems to solve problems that currently seem impossible.
Path Finding for the Coalition of Co-operative Agents Acting in the Environment with Destructible Obstacles
Andreychuk, Anton, Yakovlev, Konstantin
The problem of planning a set of paths for a coalition of robots (agents) with different capabilities is considered in the paper. Some agents can modify the environment by destroying obstacles, thus allowing the other agents to shorten their paths to the goal. As a result, a joint solution of lower cost, e.g. time to completion, may be obtained. We suggest an original procedure to identify the obstacles for further removal that can be embedded into almost any heuristic search planner (we use Theta*) and evaluate it empirically. Results of the evaluation show that the time to complete the mission can be decreased by up to 9-12% by utilizing the proposed technique.
Reports of the AAAI 2017 Fall Symposium Series
Flenner, Arjuna (NAVAIR China Lake) | Fraune, Marlena R. (Indiana University) | Hiatt, Laura M. (Naval Research Laboratory (NRL)) | Kendall, Tony (Naval Postgraduate School) | Laird, John E. (University of Michigan) | Lebiere, Christian (Carnegie Mellon University) | Rosenbloom, Paul S. (Institute for Creative Technologies, University of Southern California) | Stein, Frank (IBM) | Topp, Elin A. (Lund University) | Unhelkar, Vaibhav V. (Massachusetts Institute of Technology) | Zhao, Ying (Naval Postgraduate School)
The AAAI 2017 Fall Symposium Series was held Thursday through Saturday, November 9–11, at the Westin Arlington Gateway in Arlington, Virginia, adjacent to Washington, DC. The titles of the six symposia were Artificial Intelligence for Human-Robot Interaction; Cognitive Assistance in Government and Public Sector Applications; Deep Models and Artificial Intelligence for Military Applications: Potentials, Theories, Practices, Tools and Risks; Human-Agent Groups: Studies, Algorithms and Challenges; Natural Communication for Human-Robot Collaboration; and A Standard Model of the Mind. The highlights of each symposium (except the Natural Communication for Human-Robot Collaboration symposium, whose organizers did not submit a report) are presented in this report.
Goal Reasoning: Foundations, Emerging Applications, and Prospects
Goal reasoning (GR) has a bright future as a foundation for the research and development of intelligent agents. GR is the study of agents that can deliberate on and self-select their goals/objectives, which is a desirable capability for some applications of deliberative autonomy. While studied in diverse AI sub-communities for multiple applications, our group has focused on how GR can play a key role for controlling autonomous systems. Thus, its importance is rapidly growing and it merits increased attention, particularly from the perspective of research on AI safety. In this article, I introduce GR, briefly relate it to other AI topics, summarize some of our group’s work on GR foundations and emerging applications, and describe some current and future research directions.