AITopics

1908.08578

Country: Europe (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.62)

arXiv.org Machine Learning Aug-22-2019

Transfer Learning for Relation Extraction via Relation-Gated Adversarial Learning

Zhang, Ningyu, Deng, Shumin, Sun, Zhanlin, Chen, Jiaoyan, Zhang, Wei, Chen, Huajun

Relation extraction aims to extract relational facts from sentences. Previous models mainly rely on manually labeled datasets, seed instances or human-crafted patterns, and distant supervision. However, human annotation is expensive, human-crafted patterns suffer from semantic drift, and distant supervision samples are usually noisy. Domain adaptation methods enable leveraging labeled data from a different but related domain. However, different domains usually have different textual relation descriptions and different label spaces (the source label space is usually a superset of the target label space). To solve these problems, we propose a novel model of relation-gated adversarial learning for relation extraction, which extends adversarial-based domain adaptation. Experimental results show that the proposed approach outperforms previous domain adaptation methods on partial domain adaptation and can improve the accuracy of distantly supervised relation extraction through fine-tuning.

1 Introduction

Relation extraction (RE) is devoted to extracting relational facts from sentences, which can be applied to many natural language processing (NLP) applications such as knowledge base construction (Wu and Weld, 2010) and question answering (Dai et al., 2016).

machine learning, natural language, reinforcement learning, (17 more...)

1908.08507

Country:

Asia > Middle East (0.46)
South America > Brazil (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Kallus, Nathan, Uehara, Masatoshi

Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes

arXiv.org Artificial Intelligence Aug-22-2019

Off-policy evaluation (OPE) in reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible. We consider for the first time the semiparametric efficiency limits of OPE in Markov decision processes (MDPs), where actions, rewards, and states are memoryless. We show existing OPE estimators may fail to be efficient in this setting. We develop a new estimator based on cross-fold estimation of $q$-functions and marginalized density ratios, which we term double reinforcement learning (DRL). We show that DRL is efficient when both components are estimated at fourth-root rates and is also doubly robust when only one component is consistent. We investigate these properties empirically and demonstrate the performance benefits due to harnessing memorylessness efficiently.

artificial intelligence, efficient off-policy evaluation, machine learning, (2 more...)

1908.08526

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)

Toromanoff, Marin, Wirbel, Emilie, Moutarde, Fabien

Is Deep Reinforcement Learning Really Superhuman on Atari?

arXiv.org Artificial Intelligence Aug-22-2019

Consistent and reproducible evaluation of Deep Reinforcement Learning (DRL) is not straightforward. In the Arcade Learning Environment (ALE), small changes in environment parameters such as stochasticity or the maximum allowed play time can lead to very different performance. In this work, we discuss the difficulties of comparing different agents trained on ALE. In order to take a step further towards reproducible and comparable DRL, we introduce SABER, a Standardized Atari BEnchmark for general Reinforcement learning algorithms. Our methodology extends previous recommendations and contains a complete set of environment parameters as well as train and test procedures. We then use SABER to evaluate the current state of the art, Rainbow. Furthermore, we introduce a human world records baseline, and argue that previous claims of expert or superhuman performance of DRL might not be accurate. Finally, we propose Rainbow-IQN by extending Rainbow with Implicit Quantile Networks (IQN) leading to new state-of-the-art performance. Source code is available for reproducibility.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1908.04683

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports (0.94)
Leisure & Entertainment > Games > Computer Games (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Yang, Runzhe, Sun, Xingyuan, Narasimhan, Karthik

A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

arXiv.org Artificial Intelligence Aug-21-2019

We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear preferences, with the goal of enabling few-shot adaptation to new tasks. In MORL, the aim is to learn policies over multiple competing objectives whose relative importance (preferences) is unknown to the agent. While this alleviates dependence on scalar reward design, the expected return of a policy can change significantly with varying preferences, making it challenging to learn a single model to produce optimal policies under different preference conditions. We propose a generalized version of the Bellman equation to learn a single parametric representation for optimal policies over the space of all possible preferences. After this initial learning phase, our agent can quickly adapt to any given preference, or automatically infer an underlying preference with very few samples. Experiments across four different domains demonstrate the effectiveness of our approach.

machine learning, natural language, reinforcement learning, (18 more...)

1908.08342

Country:

Europe (1.00)
North America > Canada > Alberta (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Carpi, Fabrizio, Häger, Christian, Martalò, Marco, Raheli, Riccardo, Pfister, Henry D.

Reinforcement Learning for Channel Coding: Learned Bit-Flipping Decoding

arXiv.org Artificial Intelligence Aug-21-2019

In this paper, we use reinforcement learning to find effective decoding strategies for binary linear codes. We start by reviewing several iterative decoding algorithms that involve a decision-making process at each step, including bit-flipping (BF) decoding, residual belief propagation, and anchor decoding. We then illustrate how such algorithms can be mapped to Markov decision processes, allowing for data-driven learning of optimal decision strategies rather than basing decisions on heuristics or intuition. As a case study, we consider BF decoding for both the binary symmetric and additive white Gaussian noise channels. Our results show that learned BF decoders can offer a range of performance-complexity trade-offs for the considered Reed-Muller and BCH codes, and achieve near-optimal performance in some cases. We also demonstrate learning convergence speed-ups when biasing the learning process towards correct decoding decisions, as opposed to relying only on random explorations and past knowledge.
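For orientation, the heuristic baseline that a learned decoder would replace can be sketched as follows. This is a minimal illustration of plain greedy bit-flipping, not the authors' learned decoder: at each step it flips the bit that participates in the most unsatisfied parity checks. The function name `bit_flip_decode` and the toy parity-check matrix `H` (a redundant description of the length-3 repetition code) are assumptions made for the example; the paper itself studies Reed-Muller and BCH codes.

```python
import numpy as np

def bit_flip_decode(H, received, max_iters=10):
    """Greedy bit-flipping: flip the bit involved in the most failed checks."""
    word = np.array(received, dtype=int) % 2
    for _ in range(max_iters):
        syndrome = (H @ word) % 2          # 1 marks an unsatisfied parity check
        if not syndrome.any():             # all checks satisfied: stop
            break
        unsatisfied = syndrome @ H         # per-bit count of failed checks
        word[np.argmax(unsatisfied)] ^= 1  # heuristic decision (ties: lowest index)
    return word

# Hypothetical toy code: three pairwise parity checks on three bits,
# whose only codewords are 000 and 111.
H = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]])

decoded = bit_flip_decode(H, [0, 1, 1])    # codeword 111 with the first bit flipped
```

In the Markov decision process view described in the abstract, the `argmax` line is the decision step; a learned policy would choose which bit to flip from the current syndrome instead of using this fixed rule.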

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1906.04448

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Hu, Yang, Montana, Giovanni

Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity

arXiv.org Machine Learning Aug-20-2019

Transfer learning methods for reinforcement learning (RL) domains facilitate the acquisition of new skills using previously acquired knowledge. The vast majority of existing approaches assume that the agents have the same design, e.g. the same shape and action spaces. In this paper we address the problem of transferring previously acquired skills amongst morphologically different agents (MDAs). For instance, assuming that a bipedal agent has been trained to move forward, could this skill be transferred to a one-leg hopper so as to make its training process for the same task more sample efficient? We frame this problem as one of subspace learning whereby we aim to infer latent factors representing the control mechanism that is common between MDAs. We propose a novel paired variational encoder-decoder model, PVED, that disentangles the control of MDAs into shared and agent-specific factors. The shared factors are then leveraged for skill transfer using RL. Theoretically, we derive a theorem indicating how the performance of PVED depends on the shared factors and agent morphologies. Experimentally, PVED has been extensively validated on four MuJoCo environments. We demonstrate its performance compared to a state-of-the-art approach and several ablation cases, visualize and interpret the hidden factors, and identify avenues for future improvements.

agent, subspace, target agent, (14 more...)

1908.05265

Country: North America > United States > Montana (0.04)

Genre: Research Report (1.00)

Industry:

Education (0.72)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Zhong, Chen, Lu, Ziyang, Gursoy, M. Cenk, Velipasalar, Senem

A Deep Actor-Critic Reinforcement Learning Framework for Dynamic Multichannel Access

arXiv.org Machine Learning Aug-20-2019

To make efficient use of limited spectral resources, in this work we propose a deep actor-critic reinforcement learning framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. We employ the proposed framework as a single agent in the single-user case, and extend it to a decentralized multi-agent framework in the multi-user scenario. In both cases, we develop algorithms for actor-critic deep reinforcement learning and evaluate the proposed learning policies via experiments and numerical results. In the single-user model, in order to evaluate the performance of the proposed channel access policy and the framework's tolerance against uncertainty, we explore different channel switching patterns and different switching probabilities. In the case of multiple users, we analyze the probabilities of each user accessing channels with favorable channel conditions and the probability of collision. We also address a time-varying environment to identify the adaptive ability of the proposed framework. Additionally, we provide comparisons (in terms of both the average reward and time efficiency) between the proposed actor-critic deep reinforcement learning framework, a deep Q-network (DQN) based approach, random access, and the optimal policy when the channel dynamics are known.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1908.08401

Genre: Research Report (0.81)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Wang, Chun-Chieh, Tsai, Yun-Cheng

Deep Reinforcement Learning for Foreign Exchange Trading

arXiv.org Artificial Intelligence Aug-20-2019

Reinforcement learning interacts with the environment and is well suited to applications in decision control systems. We therefore used reinforcement learning to carry out foreign exchange trading, avoiding the long-standing problem of unstable trends in deep learning predictions. In the system design, we optimized the Sure-Fire statistical arbitrage policy, set three different actions, encoded the continuous price over a period of time into a heat-map view of the Gramian Angular Field (GAF), and compared the Deep Q Learning (DQN) and Proximal Policy Optimization (PPO) algorithms. To test feasibility, we analyzed three currency pairs, namely EUR/USD, GBP/USD, and AUD/USD. We trained on data in units of four hours from 1 August 2018 to 30 November 2018 and tested model performance using data between 1 December 2018 and 31 December 2018. The test results of the various models indicated that favorable investment performance was achieved as long as the model was able to handle complex and random processes and the state was able to describe the environment, validating the feasibility of reinforcement learning in the development of trading strategies.
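The Gramian Angular Field encoding mentioned in this abstract can be illustrated with a minimal sketch. It assumes the summation variant (GASF) with min-max rescaling of the price window to [-1, 1]; the abstract does not say which variant or rescaling the authors used, and the function name `gramian_angular_field` is hypothetical.

```python
import numpy as np

def gramian_angular_field(prices):
    """Encode a 1-D price window as a 2-D image (summation variant, assumed)."""
    x = np.asarray(prices, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    # Min-max rescale to [-1, 1] so that arccos is defined everywhere.
    scaled = np.clip(2.0 * (x - lo) / (hi - lo) - 1.0, -1.0, 1.0)
    phi = np.arccos(scaled)                      # polar-angle representation
    # Entry (i, j) is cos(phi_i + phi_j): a symmetric n-by-n image.
    return np.cos(phi[:, None] + phi[None, :])

# Illustrative values only, not data from the paper.
image = gramian_angular_field([0.0, 1.0, 2.0])
```

A window of n prices becomes an n-by-n matrix whose diagonal is determined by the rescaled values, which is what allows the series to be fed to an image-style network as a heat map.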

machine learning, reinforcement, reinforcement learning, (18 more...)

1908.08036

Country: Asia > Taiwan (0.15)

Genre: Research Report (0.65)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gonzalez-Soto, Mauricio, Espina, Felipe Orihuela

Reinforcement Learning is not a Causal problem

arXiv.org Artificial Intelligence Aug-20-2019

We use an analogy between non-isomorphic mathematical structures defined over the same set and the algebras induced by associative and causal levels of information to argue that Reinforcement Learning, in its current formulation, is not a causal problem, regardless of whether the motivation behind it has to do with an agent taking actions.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1908.07617

Country:

North America (0.16)
Europe > United Kingdom > England (0.16)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)