AITopics

1908.04734

Country: North America > United States (0.28)

Genre: Research Report (0.63)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Reinforcement Learning Applications

Li, Yuxi

We start with a brief introduction to reinforcement learning (RL), covering its success stories, basics, an example, issues, the ICML 2019 Workshop on RL for Real Life, how to use it, study material, and an outlook. Then we discuss a selection of RL applications, including recommender systems, computer systems, energy, finance, healthcare, robotics, and transportation.

computer game, deep learning, renewable energy, (24 more...)

1908.06973

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Services (1.00)
Energy > Power Industry (1.00)
Banking & Finance > Trading (0.93)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

arXiv.org Machine Learning, Aug-19-2019

Learning to Advertise for Organic Traffic Maximization in E-Commerce Product Feeds

Chen, Dagui, Jin, Junqi, Zhang, Weinan, Pan, Fei, Niu, Lvyin, Yu, Chuan, Wang, Jun, Li, Han, Xu, Jian, Gai, Kun

Most e-commerce product feeds provide blended results of advertised products and recommended products to consumers. The underlying advertising and recommendation platforms share similar, if not exactly the same, sets of candidate products. Consumers' behaviors on the advertised results constitute part of the recommendation model's training data and can therefore influence the recommended results. We refer to this process as Leverage. Considering this mechanism, we propose a novel perspective: advertisers can strategically bid through the advertising platform to optimize their recommended organic traffic. By analyzing real-world data, we first explain the principles of the Leverage mechanism, i.e., the dynamic models of Leverage. Then we introduce a novel Leverage optimization problem and formulate it as a Markov Decision Process. To deal with the sample complexity challenge in model-free reinforcement learning, we propose a novel Hybrid Training Leverage Bidding (HTLB) algorithm, which combines real-world samples and emulator-generated samples to boost learning speed and stability. Our offline experiments, as well as the results from the online deployment, demonstrate the superior performance of our approach.

machine learning, reinforcement learning, traffic, (18 more...)

arXiv.org Machine Learning

1908.06698

Country: North America > United States (0.15)

Genre: Research Report (0.50)

Industry:

Marketing (1.00)
Information Technology > Services > e-Commerce Services (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Zhu, Yongli, Liu, Chengxi

Mitigating Multi-Stage Cascading Failure by Reinforcement Learning

arXiv.org Machine Learning, Aug-19-2019

This paper proposes a cascading failure mitigation strategy based on a Reinforcement Learning (RL) method. First, the principles of RL are introduced. Then, the Multi-Stage Cascading Failure (MSCF) problem is presented and its challenges are investigated. The problem is then tackled with RL based on DC-OPF (Optimal Power Flow). The design of the key elements of the RL framework (rewards, states, etc.) is also discussed in detail. Experiments on the IEEE 118-bus system with both shallow and deep neural networks demonstrate promising results in terms of reduced system collapse rates.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Machine Learning

1908.06599

Country:

North America (0.28)
Europe > Denmark (0.14)

Genre: Research Report (0.50)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Aubret, Arthur, Matignon, Laetitia, Hassas, Salima

A survey on intrinsic motivation in reinforcement learning

Despite numerous research works in reinforcement learning (RL) and the recent successes obtained by combining it with deep learning, deep reinforcement learning (DRL) still faces many challenges. Some of them, like the ability to abstract actions or the difficulty of exploring the environment with sparse rewards, can be addressed by the use of intrinsic motivation. In this article, we provide a survey on the role of intrinsic motivation in DRL. We categorize the different kinds of intrinsic motivations and detail their interests and limitations. Our investigation shows that the combination of DRL and intrinsic motivation enables learning more complicated and more generalisable behaviours than standard DRL. We provide an in-depth analysis describing learning modules through a unifying scheme composed of information theory, compression theory and reinforcement learning. We then explain how these modules could serve as building blocks of a complete developmental architecture, highlighting the numerous outlooks of the domain.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1908.06976

Country: North America > United States > California (0.67)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (1.00)
Leisure & Entertainment > Games > Computer Games (0.67)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Moghadam, Mahshid Helali, Saadatmand, Mehrdad, Borg, Markus, Bohlin, Markus, Lisper, Björn

An Autonomous Performance Testing Framework using Self-Adaptive Fuzzy Reinforcement Learning

Test automation can result in a reduction in cost and human effort. If the optimal policy, the course of actions taken, for the intended objective in a testing process could be learnt by the testing system (e.g., a smart tester agent), then it could be reused in similar situations, thus leading to higher efficiency, i.e., less computational time. Automating stress testing to find performance breaking points remains a challenge for complex software systems. Common approaches are mainly based on source code or system model analysis or use-case based techniques. However, source code or system models might not be available at testing time. In this paper, we propose a self-adaptive fuzzy reinforcement learning-based performance (stress) testing framework (SaFReL) that enables the tester agent to learn the optimal policy for generating stress test cases leading to the performance breaking point without access to a performance model of the system under test. SaFReL learns the optimal policy through an initial learning phase, then reuses it during a transfer learning phase, while keeping the learning running in the long term. Through multiple experiments on a simulated environment, we demonstrate that our approach generates the stress test cases for different programs efficiently and adaptively without access to performance models.

machine learning, reinforcement learning, safrel, (18 more...)

1908.069

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Shin, Hyo-Sang, He, Shaoming, Tsourdos, Antonios

Computational Flight Control: A Domain-Knowledge-Aided Deep Reinforcement Learning Approach

This paper aims to examine the potential of using emerging deep reinforcement learning techniques in flight control. Instead of learning from scratch, the autopilot structure is fixed as a typical three-loop autopilot, and deep reinforcement learning is utilised to learn the autopilot gains. This domain-knowledge-aided approach is shown to significantly improve the learning efficiency. To solve the flight control problem, we then formulate a Markovian decision process with a proper reward function that enables the application of reinforcement learning theory. The state-of-the-art deep deterministic policy gradient algorithm is utilised to learn an action policy that maps the observed states to the autopilot gains. Extensive empirical numerical simulations are performed to validate the proposed computational control algorithm.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1908.06884

Genre: Research Report (1.00)

Industry:

Transportation > Air (0.48)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Ammanabrolu, Prithviraj, Riedl, Mark O.

Transfer in Deep Reinforcement Learning using Knowledge Graphs

arXiv.org Artificial Intelligence, Aug-18-2019

Text adventure games, in which players must make sense of the world through text descriptions and declare actions through text descriptions, provide a stepping stone toward grounding action in language. Prior work has demonstrated that using a knowledge graph as a state representation and question-answering to pre-train a deep Q-network facilitates faster control policy transfer. In this paper, we explore the use of knowledge graphs as a representation for domain knowledge transfer for training text-adventure playing reinforcement learning agents. Our methods are tested across multiple computer generated and human authored games, varying in domain and complexity, and demonstrate that our transfer learning methods let us learn a higher-quality control policy faster.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1908.06556

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Siriwardhana, Shamane, Weerasakera, Rivindu, Matthies, Denys J. C., Nanayakkara, Suranga

VUSFA: Variational Universal Successor Features Approximator to Improve Transfer DRL for Target Driven Visual Navigation

arXiv.org Artificial Intelligence, Aug-18-2019

In this paper, we show how novel transfer reinforcement learning techniques can be applied to the complex task of target driven navigation using the photorealistic AI2THOR simulator. Specifically, we build on the concept of Universal Successor Features with an A3C agent. We introduce the novel architectural contribution of a Successor Feature Dependant Policy (SFDP) and adopt the concept of Variational Information Bottlenecks to achieve state of the art performance. VUSFA, our final architecture, is a straightforward approach that can be implemented using our open source repository. Our approach is generalizable, showed greater stability in training, and outperformed recent approaches in terms of transfer learning ability.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

1908.06376

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Aug-16-2019

Competitive Multi-Agent Deep Reinforcement Learning with Counterfactual Thinking

Wang, Yue, Wan, Yao, Zhang, Chenwei, Cui, Lixin, Bai, Lu, Yu, Philip S.

Counterfactual thinking describes a psychological phenomenon in which people re-infer the possible results of different solutions to things that have already happened. It helps people gain more experience from mistakes and thus perform better in similar future tasks. This paper investigates counterfactual thinking for agents to find optimal decision-making strategies in multi-agent reinforcement learning environments. In particular, we propose a multi-agent deep reinforcement learning model with a structure that mimics the human psychological counterfactual thinking process to improve the competitive abilities of agents. To this end, our model generates several possible actions (intent actions) with a parallel policy structure and estimates the rewards and regrets for these intent actions based on its current understanding of the environment. Our model incorporates a scenario-based framework to link the estimated regrets with its inner policies. During the iterations, our model updates the parallel policies and the corresponding scenario-based regrets for agents simultaneously. To verify the effectiveness of our proposed model, we conduct extensive experiments on two different environments with real-world applications. Experimental results show that counterfactual thinking can actually help the agents obtain more accumulative rewards than their opponents from environments with fair information, while keeping high performance efficiency.

agent, counterfactual, reinforcement, (14 more...)

1908.04573

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China > Beijing > Beijing (0.04)
South America > Brazil (0.04)
(15 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)