 Agents


Prediction problems inspired by animal learning

arXiv.org Artificial Intelligence

We present three problems modeled after animal learning experiments designed to test online state construction or representation learning algorithms. Our test problems require the learning system to construct compact summaries of its past interaction with the world in order to predict the future, updating online and incrementally on each time step without an explicit training-testing split. The majority of recent work in Deep Reinforcement Learning focuses on either fully observable tasks, or games where stacking a handful of recent frames is sufficient for good performance. Current benchmarks used for evaluating memory and recurrent learning make use of 3D visual environments (e.g., DeepMind Lab) which require billions of training samples, complex agent architectures, and cloud-scale compute. These domains are thus not well suited for rapid prototyping, hyper-parameter study, or extensive replication study. In this paper, we contribute a set of test problems and benchmark results to fill this gap. Our test problems are designed to be the simplest instantiation and test of learning capabilities which animals readily exhibit, including (1) trace conditioning (remembering a cue in order to predict another far in the future), (2) patterning (a particular combination of cues predicts another), and (3) combinations of both with additional non-relevant distracting signals. We provide baselines for each problem including heuristics from the early days of neural network learning and simple ideas inspired by computational models of animal learning. Our results highlight the difficulty of our test problems for online recurrent learning systems and show that the agent's performance is often substantially sensitive to the choice of key problem and agent parameters.
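To make the trace conditioning setting concrete, the following is a minimal sketch of such a prediction problem, not the paper's benchmark. It assumes a binary cue followed by a binary target a fixed number of steps later, with random gaps between trials; the function names, the `isi` and `iti_range` parameters, and the discounted-return prediction target are all illustrative assumptions.

```python
import random


def trace_conditioning_stream(num_steps, isi=10, iti_range=(20, 40), seed=0):
    """Return a list of (cue, target) pairs, one per time step.

    Hypothetical setup: the cue is 1 on the step a trial starts, and the
    target is 1 exactly `isi` steps later. Both are 0 everywhere else.
    """
    rng = random.Random(seed)
    cue = [0] * num_steps
    target = [0] * num_steps
    t = rng.randint(*iti_range)  # random delay before the first trial
    while t + isi < num_steps:
        cue[t] = 1
        target[t + isi] = 1
        t += isi + rng.randint(*iti_range)  # random gap before the next trial
    return list(zip(cue, target))


def discounted_returns(targets, gamma=0.9):
    """G_t = sum over k >= 1 of gamma**(k - 1) * targets[t + k]."""
    g = 0.0
    out = [0.0] * len(targets)
    # Sweep backwards so each return reuses the one after it.
    for t in range(len(targets) - 1, -1, -1):
        out[t] = g
        g = targets[t] + gamma * g
    return out


stream = trace_conditioning_stream(200, isi=10)
returns = discounted_returns([target for _, target in stream], gamma=0.9)
```

Because the cue is observed for a single step only, a learner that sees just the current observation cannot predict `returns` during the gap; it has to carry a summary of the cue forward in time, which is the capability the abstract describes.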


Multi-Agent Decentralized Belief Propagation on Graphs

arXiv.org Artificial Intelligence

We consider the problem of interactive partially observable Markov decision processes (I-POMDPs), where the agents are located at the nodes of a communication network. Specifically, we assume a certain message type for all messages. Moreover, each agent makes individual decisions based on the interactive belief states, the information observed locally, and the messages received from its neighbors over the network. Within this setting, the collective goal of the agents is to maximize the globally averaged return over the network by exchanging information with their neighbors. We propose a decentralized belief propagation algorithm for the problem and prove its convergence. Finally, we show multiple applications of our framework. Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.


Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems

arXiv.org Artificial Intelligence

In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, partial state observations, and a multiagent structure. We discuss and compare algorithms that simultaneously or sequentially optimize the agents' controls by using multistep lookahead, truncated rollout with a known base policy, and a terminal cost function approximation. Our methods specifically address the computational challenges of partially observable multiagent problems. In particular: 1) We consider rollout algorithms that dramatically reduce required computation while preserving the key cost improvement property of the standard rollout method. The per-step computational requirements for our methods are on the order of $O(Cm)$ as compared with $O(C^m)$ for standard rollout, where $C$ is the maximum cardinality of the constraint set for the control component of each agent, and $m$ is the number of agents. 2) We show that our methods can be applied to challenging problems with a graph structure, including a class of robot repair problems whereby multiple robots collaboratively inspect and repair a system under partial information. 3) We provide a simulation study that compares our methods with existing methods, and demonstrate that our methods can handle larger and more complex partially observable multiagent problems (state space size $10^{37}$ and control space size $10^{7}$, respectively). Finally, we incorporate our multiagent rollout algorithms as building blocks in an approximate policy iteration scheme, where successive rollout policies are approximated by using neural network classifiers. While this scheme requires a strictly off-line implementation, it works well in our computational experiments and produces additional significant performance improvement over the single online rollout iteration method.
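The gap between $O(Cm)$ and $O(C^m)$ can be illustrated with a small sketch that counts cost evaluations. This is not the paper's implementation: the cost function `q` below is a toy stand-in for the Q-factor that a rollout simulation would estimate, and the function names and base controls are assumptions made for illustration.

```python
from itertools import product


def joint_minimize(q, controls, m):
    """Search all joint controls of m agents: C**m cost evaluations."""
    evals = 0
    best, best_cost = None, float("inf")
    for u in product(controls, repeat=m):
        evals += 1
        cost = q(u)
        if cost < best_cost:
            best, best_cost = u, cost
    return best, evals


def sequential_minimize(q, controls, base):
    """Optimize one agent at a time: C * m cost evaluations.

    Agents already processed keep their chosen controls; the remaining
    agents keep the controls of the (hypothetical) base policy.
    """
    u = list(base)
    evals = 0
    for i in range(len(u)):
        best_c, best_cost = u[i], float("inf")
        for c in controls:
            evals += 1
            cost = q(tuple(u[:i] + [c] + u[i + 1:]))
            if cost < best_cost:
                best_c, best_cost = c, cost
        u[i] = best_c
    return tuple(u), evals


def toy_cost(u):
    # Toy separable cost, minimized when every agent picks control 2.
    return sum((x - 2) ** 2 for x in u)


controls = [0, 1, 2, 3]   # C = 4 controls per agent
base = (0, 0, 0, 0, 0)    # m = 5 agents, base policy picks control 0
joint_u, joint_evals = joint_minimize(toy_cost, controls, len(base))
seq_u, seq_evals = sequential_minimize(toy_cost, controls, base)
print(joint_evals, seq_evals)
```

With $C = 4$ and $m = 5$, the joint search evaluates the cost 1024 times and the agent-by-agent search 20 times. Each agent's candidate set includes its current control, so the cost of the sequential result never exceeds that of the base controls. For this separable toy cost the two searches agree, which need not hold for a general cost function.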


Two motivational artificial beings are better than one for enhancing learning: Researchers find that praise delivered by robots and virtual agents improves offline learning

#artificialintelligence

In a study published this month in PLOS ONE, researchers from the University of Tsukuba have shown that motor task performance in participants was significantly enhanced by praise from either one or two robots or virtual agents. Although praise from robots and virtual agents has been found to enhance human motivation and performance during a task, whether these interactions have similar effects on offline skill consolidation, which is an essential component of the learning process, has not been investigated. Further, the various conditions associated with the delivery of praise by robot and virtual agents have not been thoroughly explored previously. The researchers at the University of Tsukuba aimed to address these questions in the present study. "Previous studies have shown that praise from others can positively affect offline improvements in human motor skills," says first author Masahiro Shiomi. "However, whether praise from artificial beings can have similar effects on offline improvements has not been explored previously."


Bounded Risk-Sensitive Markov Game and Its Inverse Reward Learning Problem

arXiv.org Machine Learning

Classical game-theoretic approaches for multi-agent systems in both the forward policy design problem and the inverse reward learning problem often make strong rationality assumptions: agents perfectly maximize expected utilities under uncertainties. Such assumptions, however, substantially mismatch observed human behaviors such as satisficing with sub-optimal, risk-seeking, and loss-averse decisions. In this paper, we investigate the problem of bounded risk-sensitive Markov Game (BRSMG) and its inverse reward learning problem. Drawing on iterative reasoning models and cumulative prospect theory, we model humans as having bounded intelligence and maximizing risk-sensitive utilities in BRSMGs. Convergence analyses for both the forward policy design and the inverse reward learning problems are established under the BRSMG framework. We also validate the proposed forward policy design and inverse reward learning algorithms in a navigation scenario. The results show that the behaviors of agents demonstrate both risk-averse and risk-seeking characteristics. Moreover, in the inverse reward learning task, the proposed bounded risk-sensitive inverse learning algorithm outperforms a baseline risk-neutral inverse learning algorithm by effectively recovering not only more accurate reward values but also the intelligence levels and the risk-measure parameters given demonstrations of agents' interactive behaviors.


Multi-Agent Reinforcement Learning in Time-varying Networked Systems

arXiv.org Machine Learning

In comparison to single-agent reinforcement learning (RL), MARL poses many challenges, chief of which is scalability [49]. Even if each agent's local state/action spaces are small, the size of the global state/action space can be large, potentially exponentially large in the number of agents, which renders many RL algorithms such as $Q$-learning not applicable. A promising approach for addressing the scalability challenge that has received attention in recent years is to exploit application-specific structures, e.g., [16, 32, 35]. A particularly important example of such a structure is a networked structure, e.g., applications in multi-agent networked systems such as social networks [6, 24], communication networks [44, 52], queueing networks [31], and smart transportation networks [51]. In these networked systems, it is often possible to exploit static, local dependency structures [1, 14, 15, 29], e.g., the fact that agents only interact with a fixed set of neighboring agents throughout the game. This sort of dependency structure often leads to scalable, distributed algorithms for optimization and control [1, 14, 29], and has proven effective for designing scalable and distributed MARL algorithms, e.g.


Evolution of Artificial Intelligent Plane

arXiv.org Artificial Intelligence

Networks are evolving to meet user demands. The main qualities that made the conventional Internet successful are heterogeneity and generality, combined with user transparency and rich functionality for end-to-end systems. Today's networks display the characteristics of unstable, convoluted systems. To date, most networks are opaque to their applications and provide only best-effort delivery of packets, with little or no information about the reliability and performance characteristics of different paths. Granted, this design works well for the simple client-server model, but many emerging technologies, such as NFV (Network Function Virtualization) [8], IoT (Internet of Things) [9], Software Defined Networking [10], CDN (Content Delivery Networks) [11], LTE (Long-Term Evolution) [12], and 5G cellular networks [13], depend heavily on rich information about the state of the network. For example, the author of [14] describes how, if VNFs (Virtual Network Functions) [15] are not aware of the traffic on the virtio interfaces assisting the hypervisor, a bottleneck may result in the NFV infrastructure. In other words, VNFs should know the state of the network (in terms of traffic) to accelerate applications hosted across VNFs in the NFV infrastructure. The authors of [16] explain the need for data storage as the number of connected IoT devices increases at an unprecedented rate [17]. To optimize data storage, it is imperative for IoT nodes to know about the other nodes and the methods they use to move data among networks.


Provenance-Based Interpretation of Multi-Agent Information Analysis

arXiv.org Artificial Intelligence

Analytic software tools and workflows are increasing in capability, complexity, number, and scale, and the integrity of our workflows is as important as ever. Specifically, we must be able to inspect the process of analytic workflows to assess (1) confidence of the conclusions, (2) risks and biases of the operations involved, (3) sensitivity of the conclusions to sources and agents, (4) impact and pertinence of various sources and agents, and (5) diversity of the sources that support the conclusions. We present an approach that tracks agents' provenance with PROV-O in conjunction with agents' appraisals and evidence links (expressed in our novel DIVE ontology). Together, PROV-O and DIVE enable dynamic propagation of confidence and counter-factual refutation to improve human-machine trust and analytic integrity. We demonstrate representative software developed for user interaction with that provenance, and discuss key needs for organizations adopting such approaches. We demonstrate all of these assessments in a multi-agent analysis scenario, using an interactive web-based information validation UI.


Adapting a Language Model for Controlled Affective Text Generation

arXiv.org Artificial Intelligence

Humans use language not just to convey information but also to express their inner feelings and mental states. In this work, we adapt state-of-the-art language generation models to generate affective (emotional) text. We posit a model capable of generating affect-driven and topic-focused sentences without losing grammatical correctness as the affect intensity increases. We propose to incorporate emotion as a prior for probabilistic state-of-the-art text generation models such as GPT-2. The model gives a user the flexibility to control the category and intensity of emotion as well as the topic of the generated text. Previous attempts at modelling fine-grained emotions fall short on grammatical correctness at extreme intensities, but our model is resilient to this and delivers robust results at all intensities. We conduct automated evaluations and human studies to test the performance of our model, and provide a detailed comparison of the results with other models. In all evaluations, our model outperforms existing affective text generation models.


The tech secret weapon that sent customer-service satisfaction soaring

#artificialintelligence

No matter how graciously good customer service is rendered, online reviews are more likely to cite "bad customer service" than the opposite. Problematic service dominates reviews. People are more likely to pen an angry, disgruntled review than a positive one, yet positive acknowledgement can boost a business's morale and bring in more customers. The coronavirus pandemic caused stress and anxiety among those who were overwhelmed as they faced unprecedented challenges and sought help in navigating the technology necessitated by social distancing and isolation. Yet there's a bright light at the end of the tunnel: issues have been resolved by the judicious use of artificial intelligence (AI)-enabled chatbots and virtual agents, according to a new IBM report, "The value of virtual technology."