Agents
Consequentialist Conditional Cooperation in Social Dilemmas with Imperfect Information (Short Workshop Version)
Peysakhovich, Alexander (Facebook AI Research) | Lerer, Adam (Facebook AI Research)
Social dilemmas, where mutual cooperation can lead to high payoffs but participants face incentives to cheat, are ubiquitous in multi-agent interaction. We wish to construct agents that cooperate with pure cooperators, avoid exploitation by pure defectors, and incentivize cooperation from the rest. However, often the actions taken by a partner are (partially) unobserved or the consequences of individual actions are hard to predict. We show that in a large class of games good strategies can be constructed by conditioning one's behavior solely on outcomes (i.e. one's past rewards). We call this consequentialist conditional cooperation. We show how to construct such strategies using deep reinforcement learning techniques and demonstrate, both analytically and experimentally, that they are effective in social dilemmas beyond simple matrix games. We also show the limitations of relying purely on consequences and discuss the need for understanding both the consequences of and the intentions behind an action.
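The idea of conditioning behavior solely on one's own past rewards can be illustrated with a toy example. The following is a minimal sketch, not the authors' method (which builds its strategies with deep reinforcement learning): it assumes a repeated prisoner's dilemma, and the payoff table, the `threshold` value, and the class and function names are all hypothetical choices made for illustration.

```python
# Hypothetical prisoner's dilemma payoffs: (my_action, partner_action) -> my reward.
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 4.0, ("D", "D"): 1.0}


class OutcomeConditionalAgent:
    """Cooperates while its own average reward stays at or above a threshold.

    The agent never looks at the partner's actions, only at its own rewards.
    The threshold (an assumption) sits between the mutual-defection payoff
    and the mutual-cooperation payoff.
    """

    def __init__(self, threshold=2.0):
        self.threshold = threshold
        self.total_reward = 0.0
        self.rounds = 0

    def act(self):
        if self.rounds == 0:
            return "C"  # start by cooperating
        average = self.total_reward / self.rounds
        return "C" if average >= self.threshold else "D"

    def observe(self, reward):
        self.total_reward += reward
        self.rounds += 1


def play(agent, partner_actions):
    """Run the agent against a fixed sequence of partner actions."""
    history = []
    for partner_action in partner_actions:
        action = agent.act()
        agent.observe(PAYOFF[(action, partner_action)])
        history.append(action)
    return history


vs_cooperator = play(OutcomeConditionalAgent(), ["C"] * 5)
vs_defector = play(OutcomeConditionalAgent(), ["D"] * 5)
```

Under these assumed payoffs, the agent keeps cooperating with a pure cooperator and switches to defection after one round against a pure defector.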
Planning and Learning for Decentralized MDPs with Event Driven Rewards
Gupta, Tarun (International Institute of Information Technology, Hyderabad) | Kumar, Akshat (Singapore Management University) | Paruchuri, Praveen (International Institute of Information Technology, Hyderabad)
Decentralized (PO)MDPs provide a rigorous framework for sequential multiagent decision making under uncertainty. However, their high computational complexity limits the practical impact. To address scalability and real-world impact, we focus on settings where a large number of agents primarily interact through complex joint-rewards that depend on their entire histories of states and actions. Such history-based rewards encapsulate the notion of events or tasks such that the team reward is given only when the joint-task is completed. Algorithmically, we contribute: 1) a nonlinear programming (NLP) formulation for such an event-based planning model; 2) a probabilistic inference based approach that scales much better than NLP solvers for a large number of agents; 3) a policy gradient based multiagent reinforcement learning approach that scales well even for exponential state spaces.
ITSM's next wave: AI and machine learning - ITOps Times
IT help desk technicians and administrators can't move fast enough to keep up with the flood of tickets that are coming their way these days. The good news is that help is on the way. Thanks to advances in artificial intelligence (AI), machine learning and predictive analytics, the ability to automate the resolution of IT service issues with little or no human intervention is now surfacing. Bringing AI and machine learning to facilitate self-service IT Service Management (ITSM) driven by bots, virtual agents and even using conversational computing capabilities is a common focus of the leading providers including ServiceNow, BMC Software, Micro Focus, Samanage, Atlassian and Ivanti, among many others. Several factors are now bringing AI into the ITSM equation, notably the availability of cloud-scale services, new programmable machine learning APIs that are now commercially viable and the introduction of virtual agents and bots into other customer service tools.
Wikipedia is used to give AI Common Sense Knowledge QPT
Researchers from BYU (Brigham Young University) succeeded in giving common sense to artificial intelligence agents with the help of Wikipedia. Walk into a room, see a chair, and your brain will tell you that you can sit in it, tip it over or lift it up, but you wouldn't even consider drinking it, promoting it or unlocking it. As humans, we know intuitively that certain verbs pair naturally with certain nouns, and we also know that most verbs don't make sense when paired with random nouns. Consider the monitor on your desk: you can look at it, you can turn it on, you can even pick it up or throw it, but you cannot impeach it, transpose it, justify it or correct it. That intuition, for the most part, doesn't exist with computer artificial intelligence agents, which are good at identifying objects but less good at knowing what to do with them. So the BYU Research Team developed a method for teaching agents about affordances -- the set of actions that can be done with an object.
Towards Intelligent Vehicular Networks: A Machine Learning Framework
Liang, Le | Ye, Hao | Li, Geoffrey Ye
As wireless networks evolve towards high mobility and providing better support for connected vehicles, a number of new challenges arise due to the resulting high dynamics in vehicular environments and thus motivate rethinking of traditional wireless design methodologies. Future intelligent vehicles, which are at the heart of high mobility networks, are increasingly equipped with multiple advanced onboard sensors and keep generating large volumes of data. Machine learning, as an effective approach to artificial intelligence, can provide a rich set of tools to exploit such data for the benefit of the networks. In this article, we first identify the distinctive characteristics of high mobility vehicular networks and motivate the use of machine learning to address the resulting challenges. After a brief introduction of the major concepts of machine learning, we discuss its applications to learn the dynamics of vehicular networks and make informed decisions to optimize network performance. In particular, we discuss in greater detail the application of reinforcement learning in managing network resources as an alternative to the prevalent optimization approach. Finally, some open issues worth further investigation are highlighted.
Trust as a Precursor to Belief Revision
Belief revision is concerned with incorporating new information into a pre-existing set of beliefs. When the new information comes from another agent, we must first determine if that agent should be trusted. In this paper, we define trust as a pre-processing step before revision. We emphasize that trust in an agent is often restricted to a particular domain of expertise. We demonstrate that this form of trust can be captured by associating a state partition with each agent, then relativizing all reports to this partition before revising. We position the resulting family of trust-sensitive revision operators within the class of selective revision operators of Ferme and Hansson, and we prove a representation result that characterizes the class of trust-sensitive revision operators in terms of a set of postulates. We also show that trust-sensitive revision is manipulable, in the sense that agents can sometimes have incentive to pass on misleading information.
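The step of relativizing a report to an agent's state partition before revising can be sketched on a small finite example. This is an illustrative sketch under stated assumptions, not the paper's formal construction: states are modeled as tuples, beliefs and reports as sets of states, and the `revise` function is a deliberately crude stand-in for a proper revision operator. All names and the doctor example are hypothetical.

```python
def relativize(report, partition):
    """Coarsen a report to the agent's partition.

    The result is the union of every partition cell that contains at least
    one state consistent with the report, so distinctions the agent is not
    trusted to make are erased.
    """
    result = set()
    for cell in partition:
        if cell & report:
            result |= cell
    return result


def revise(beliefs, new_info):
    """Crude revision (an assumption): keep the overlap if there is one,
    otherwise adopt the new information outright."""
    overlap = beliefs & new_info
    return overlap if overlap else set(new_info)


def trust_sensitive_revise(beliefs, report, partition):
    """Relativize the report to the reporter's partition, then revise."""
    return revise(beliefs, relativize(report, partition))


# States are (health, weather) pairs.
states = {(h, w) for h in ("sick", "well") for w in ("rain", "dry")}

# A doctor is trusted on health only: states that differ only in weather
# fall into the same cell.
doctor_partition = [
    frozenset(s for s in states if s[0] == "sick"),
    frozenset(s for s in states if s[0] == "well"),
]

beliefs = {("sick", "dry"), ("well", "dry")}  # sure it is dry, unsure about health
report = {("sick", "rain")}                   # doctor says: sick and raining

revised = trust_sensitive_revise(beliefs, report, doctor_partition)
```

In this example the doctor's claim about health is accepted while the claim about the weather is ignored, so the revised belief set is `{("sick", "dry")}`.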
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Rashid, Tabish | Samvelyan, Mikayel | de Witt, Christian Schroeder | Farquhar, Gregory | Foerster, Jakob | Whiteson, Shimon
In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised learning, but the best strategy for then extracting decentralised policies is unclear. Our solution is QMIX, a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations. We structurally enforce that the joint-action value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning, and guarantees consistency between the centralised and decentralised policies. We evaluate QMIX on a challenging set of StarCraft II micromanagement tasks, and show that QMIX significantly outperforms existing value-based multi-agent reinforcement learning methods.
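The monotonicity constraint can be illustrated with a small sketch showing why it makes the per-agent greedy actions agree with the joint greedy action. This is a simplified illustration, not the QMIX implementation: it uses fixed, hand-picked mixing weights (the abstract does not say how the mixing weights are produced), enforces monotonicity by taking absolute values of the weights, and uses hypothetical per-agent Q-values.

```python
import itertools
import math


def elu(x):
    """Strictly increasing activation, so monotonicity is preserved."""
    return x if x > 0 else math.exp(x) - 1.0


def mix(agent_qs, w1, b1, w2, b2):
    """Combine per-agent values into a joint value.

    Taking abs() of every weight makes the output non-decreasing in each
    per-agent value, which is the structural constraint being illustrated.
    """
    hidden = [
        elu(sum(abs(w1[i][j]) * q for i, q in enumerate(agent_qs)) + b1[j])
        for j in range(len(b1))
    ]
    return sum(abs(w2[j]) * h for j, h in enumerate(hidden)) + b2


def decentralised_greedy(q_tables):
    """Each agent independently picks the action with its highest value."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_tables)


def joint_greedy(q_tables, params):
    """Brute-force maximisation of the mixed value over all joint actions."""
    joint_actions = itertools.product(*(range(len(q)) for q in q_tables))
    return max(
        joint_actions,
        key=lambda a: mix([q[ai] for q, ai in zip(q_tables, a)], *params),
    )


# Hypothetical per-agent action values (agent 0 has 3 actions, agent 1 has 2).
q_tables = [[0.2, 1.5, -0.3], [0.9, 0.1]]

# Hypothetical raw mixing parameters; signs are discarded by abs() in mix().
params = (
    [[0.5, -1.2], [-0.7, 0.3]],  # w1: agents x hidden units
    [0.1, -0.4],                 # b1
    [1.0, -0.8],                 # w2
    0.05,                        # b2
)

local_choice = decentralised_greedy(q_tables)
joint_choice = joint_greedy(q_tables, params)
```

Because the mixed value is increasing in every per-agent value, the exhaustive joint search and the independent per-agent choices select the same joint action, which is what makes the maximisation tractable.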
Machine Behavior Needs to Be an Academic Discipline - Issue 58: Self
What if physiologists were the only people who study human behavior at all scales: from how the human body functions, to how social norms emerge, to how the stock market functions, to how we create, share, and consume culture? What if neuroscientists were the only people tasked with studying criminal behavior, designing educational curricula, and devising policies to fight tax evasion? Despite their growing influence on our lives, our study of AI agents is conducted this way--by a very specific group of people. Those scientists who create AI agents--namely, computer scientists and roboticists--are almost exclusively the same scientists who study the behavior of AI agents. We cannot certify that an AI agent is ethical by looking at its source code, any more than we can certify that humans are good by scanning their brains.
Distributed Constraint Optimization Problems and Applications: A Survey
Fioretto, Ferdinando | Pontelli, Enrico | Yeoh, William
The field of multi-agent systems (MAS) is an active area of research within artificial intelligence, with an increasingly important impact in industrial and other real-world applications. In a MAS, autonomous agents interact to pursue personal interests and/or to achieve common objectives. Distributed Constraint Optimization Problems (DCOPs) have emerged as a prominent agent model to govern the agents' autonomous behavior, where both algorithms and communication models are driven by the structure of the specific problem. During the last decade, several extensions to the DCOP model have been proposed to enable support of MAS in complex, real-time, and uncertain environments. This survey provides an overview of the DCOP model, offering a classification of its multiple extensions and addressing both resolution methods and applications that find a natural mapping within each class of DCOPs. The proposed classification suggests several future perspectives for DCOP extensions and identifies challenges in the design of efficient resolution algorithms, possibly through the adaptation of strategies from different areas.
Artificial Intelligence and Robotics
Andreu-Perez, Javier | Deligianni, Fani | Ravi, Daniele | Yang, Guang-Zhong
The recent successes of AI have captured the wildest imagination of both the scientific communities and the general public. Robotics and AI amplify human potentials, increase productivity and are moving from simple reasoning towards human-like cognitive abilities. Current AI technologies are used in a set area of applications, ranging from healthcare, manufacturing, transport, energy, to financial services, banking, advertising, management consulting and government agencies. The global AI market was around 260 billion USD in 2016 and it is estimated to exceed 3 trillion by 2024. To understand the impact of AI, it is important to draw lessons from its past successes and failures, and this white paper provides a comprehensive explanation of the evolution of AI, its current status and future directions.