AITopics

1911.13288

Country:

Asia > China (0.46)
Europe (0.46)
Asia > Middle East > Republic of Türkiye (0.46)
(5 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)
Research Report > Promising Solution (0.68)

Industry:

Information Technology (1.00)
Banking & Finance > Trading (1.00)
Banking & Finance > Economy (1.00)
Energy > Oil & Gas > Upstream (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

arXiv.org Artificial IntelligenceNov-28-2019

Option-critic in cooperative multi-agent systems

Chakravorty, Jhelum, Ward, Nadeem, Roy, Julien, Chevalier-Boisvert, Maxime, Basu, Sumana, Lupu, Andrei, Precup, Doina

In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems using the options framework (Sutton et al, 1999) and provide a model-free algorithm for this problem. First, we address the planning problem for the decentralized POMDP represented by the multi-agent system, by introducing a common information approach. We use common beliefs and broadcasting to solve an equivalent centralized POMDP problem. Then, we propose the Distributed Option Critic (DOC) algorithm, motivated by the work of Bacon et al (2017) in the single-agent setting. Our approach uses centralized option evaluation and decentralized intra-option improvement. We analyze theoretically the asymptotic convergence of DOC and validate its performance in grid-world environments, where we implement DOC using a deep neural network. Our experiments show that DOC performs competitively with state-of-the-art algorithms and that it is scalable when the number of agents increases.

agent, dec-pomdp, information, (14 more...)

1911.12825

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.91)
(2 more...)

Jain, Vishal, Fedus, William, Larochelle, Hugo, Precup, Doina, Bellemare, Marc G.

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction

arXiv.org Artificial IntelligenceNov-27-2019

Text-based games are a natural challenge domain for deep reinforcement learning algorithms. Their state and action spaces are combinatorially large, their reward function is sparse, and they are partially observable: the agent is informed of the consequences of its actions through textual feedback. In this paper we emphasize this latter point and consider the design of a deep reinforcement learning agent that can play from feedback alone. Our design recognizes and takes advantage of the structural characteristics of text-based games. We first propose a contextualisation mechanism, based on accumulated reward, which simplifies the learning problem and mitigates partial observability. We then study different methods that rely on the notion that most actions are ineffectual in any given situation, following Zahavy et al.'s idea of an admissible action. We evaluate these techniques in a series of text-based games of increasing difficulty based on the TextWorld framework, as well as the iconic game Zork. Empirically, we find that these techniques improve the performance of a baseline deep reinforcement learning agent applied to text-based games.

agent, score contextualisation, subtask, (12 more...)

1911.12511

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Games (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Ak, Serkan, Bruggenwirth, Stefan

Avoiding Jammers: A Reinforcement Learning Approach

arXiv.org Machine LearningNov-27-2019

This paper investigates the anti-jamming performance of a cognitive radar under a partially observable Markov decision process (POMDP) model. First, we obtain an explicit expression for uncertainty of jammer dynamics, which paves the way for illuminating the performance metric of probability of being jammed for the radar beyond a conventional signal-to-noise ratio ($\mathsf{SNR}$) based analysis. Considering two frequency hopping strategies developed in the framework of reinforcement learning (RL), this performance metric is analyzed with deep Q-network (DQN) and long short term memory (LSTM) networks under various uncertainty values. Finally, the requirement of the target network in the RL algorithm for both network architectures is replaced with a softmax operator. Simulation results show that this operator improves upon the performance of the traditional target network.

jammer, probability, radar, (17 more...)

1911.08874

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Monterey County > Pacific Grove (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Telecommunications (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Eghtesad, Taha, Vorobeychik, Yevgeniy, Laszka, Aron

Deep Reinforcement Learning based Adaptive Moving Target Defense

arXiv.org Artificial IntelligenceNov-27-2019

Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system as well as the adversary's observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose finding optimal MTD strategies using deep reinforcement learning. Based on an established model of adaptive MTD, we formulate finding an MTD strategy as finding a policy for a partially-observable Markov decision process. To significantly improve training performance, we introduce compact memory representations. To demonstrate our approach, we provide thorough numerical results, showing significant improvement over existing strategies.

adversary, defender, server, (12 more...)

1911.11972

Country: North America > United States > Texas > Harris County > Houston (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

#artificialintelligenceNov-26-2019, 11:19:46 GMT

Workshop IV: Using Physical Insights for Machine Learning

In this workshop we will explore how to use physical intuition and ideas to design new classes of machine learning (ML) algorithms. Physics-inspired sampling algorithms could be used to train ML structures or sample the hyper-parameter space (e.g. Additionally, physics-based models such as Ising/Potts models or energy-based models have influenced ML inference frameworks such as Markov Random Fields and Restricted Boltzmann Machines, and we want to continue the discussion to facilitate this innovation transfer. Finally, physical insight could be used to enhance learning in the situation of scarce data by enforcing smoothness, differentiability or other physical properties relevant to a given problem. We will also explore the use of Koopmans' theorem to design learning algorithms for dynamical systems.

machine learning, physical insight, workshop iv, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.65)

arXiv.org Machine LearningNov-26-2019

Approximating the Permanent by Sampling from Adaptive Partitions

Kuck, Jonathan, Dao, Tri, Rezatofighi, Hamid, Sabharwal, Ashish, Ermon, Stefano

Computing the permanent of a non-negative matrix is a core problem with practical applications ranging from target tracking to statistical thermodynamics. However, this problem is also #P-complete, which leaves little hope for finding an exact solution that can be computed efficiently. While the problem admits a fully polynomial randomized approximation scheme, this method has seen little use because it is both inefficient in practice and difficult to implement. We present AdaPart, a simple and efficient method for drawing exact samples from an unnormalized distribution. Using AdaPart, we show how to construct tight bounds on the permanent which hold with high probability, with guaranteed polynomial runtime for dense matrices. We find that AdaPart can provide empirical speedups exceeding 25x over prior sampling methods on matrices that are challenging for variational based approaches. Finally, in the context of multi-target tracking, exact sampling from the distribution defined by the matrix permanent allows us to use the optimal proposal distribution during particle filtering. Using AdaPart, we show that this leads to improved tracking performance using an order of magnitude fewer samples.

algorithm, matrix, partition, (16 more...)

1911.11856

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)
Asia > India > West Bengal > Kolkata (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Verma, Harsh, Thickstun, John

Convolutional Composer Classification

arXiv.org Machine LearningNov-26-2019

The composer classification question has been posed for a variety of corpora, from Renaissance composers [2,3], to the narrow (and challenging) case of Haydn and Mozart string quartets [5, 8, 12, 22], and to various collections of classical era composers (most of the other papers discussed in Section 2). In this work we study an expansive collection of scores, from 13th century sacred music by Guillaume Du Fay to 20th century ragtimes by Scott Joplin. A major challenge of this task is learning from limited data. While the corpus considered here is larger than most, this is largely due to the number of composers considered (19): for specific composers, we have at most 466 scores (Bach) and as few as 22 (Japart). Small datasets are an inherent problem for composer classification: the corpus used in this work contains, for example, all of the Bach chorales and all of the Mozart string quartets. We cannot resurrect these composers and have them write us more scores to include in our corpus. This situation contrasts starkly with many learning problems, where substantial progress can be made by collecting massive datasets and exhaustively training an expressive model (usually a deep neural network) with "big data."

classification, composer, corpus, (14 more...)

1911.11737

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

arXiv.org Machine LearningNov-26-2019

Representation Learning: A Statistical Perspective

Xie, Jianwen, Gao, Ruiqi, Nijkamp, Erik, Zhu, Song-Chun, Wu, Ying Nian

Learning representations of data is an important problem in statistics and machine learning. While the origin of learning representations can be traced back to factor analysis and multidimensional scaling in statistics, it has become a central theme in deep learning with important applications in computer vision and computational neuroscience. In this article, we review recent advances in learning representations from a statistical perspective. In particular, we review the following two themes: (a) unsupervised learning of vector representations and (b) learning of both vector and matrix representations.

generator model, representation, vector, (14 more...)

1911.11374

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre:

Research Report (0.82)
Overview (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Zhang, Kaiqing, Yang, Zhuoran, Başar, Tamer

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

arXiv.org Artificial IntelligenceNov-24-2019

Recent years have witnessed significant advances in reinforcement learning (RL), which has registered great success in solving various sequential decision-making problems in machine learning. Most of the successful RL applications, e.g., the games of Go and Poker, robotics, and autonomous driving, involve the participation of more than one single agent, which naturally fall into the realm of multi-agent RL (MARL), a domain with a relatively long history, and has recently re-emerged due to advances in single-agent RL techniques. Though empirically successful, theoretical foundations for MARL are relatively lacking in the literature. In this chapter, we provide a selective overview of MARL, with focus on algorithms backed by theoretical analysis. More specifically, we review the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two. We also introduce several significant but challenging applications of these algorithms. Orthogonal to the existing reviews on MARL, we highlight several new angles and taxonomies of MARL theory, including learning in extensive-form games, decentralized MARL with networked agents, MARL in the mean-field regime, (non-)convergence of policy-based methods for learning in games, etc. Some of the new angles extrapolate from our own research endeavors and interests. Our overall goal with this chapter is, beyond providing an assessment of the current state of the field on the mark, to identify fruitful future research directions on theoretical studies of MARL. We expect this chapter to serve as continuing stimulus for researchers interested in working on this exciting while challenging topic.

agent, algorithm, arxiv preprint arxiv, (11 more...)

1911.10635

Country:

North America > Canada > Alberta (0.14)
North America > United States > Texas (0.04)
Asia > Middle East > Jordan (0.04)
(8 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Transportation (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)