SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning

Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar

Dec-21-2020–arXiv.org Machine Learning

We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE. The approach is designed for use in conjunction with any existing RL algorithm, and consists of rewarding improvement over the agent's own past performance. We prove that SIBRE converges in expectation under the same conditions as the original RL algorithm. The reshaped rewards help discriminate between policies when the original rewards are weakly discriminated or sparse. Experiments on several well-known benchmark environments with different RL algorithms show that ...

Similar approaches appear to have worked in literature on container loading [27] and railway scheduling [11] problems, without being formally proposed or analysed. One study on bin packing does propose reward shaping explicitly, and is described below. Literature on formal reward shaping: The proposed approach (SIBRE) falls under the category of reward shaping approaches for RL, but with some key novelty points as described below. Prior literature has shown that the optimal policy learnt by RL remains invariant under reward shaping if the modification can be expressed as a potential function [15].
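
The two ideas mentioned in this excerpt, rewarding improvement over the agent's own past performance and shaping with a potential function, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `SelfImprovementReward`, the step size `beta`, the running `threshold`, and the choice to update the threshold as a moving average of episode returns are all assumptions made for illustration, since the excerpt does not give the update rule. The function `potential_shaping_term` uses the standard potential-based form F = gamma * Phi(s') - Phi(s); its name and signature are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SelfImprovementReward:
    """Hypothetical helper: scores an episode relative to past performance."""

    beta: float = 0.1       # assumed step size for the running threshold
    threshold: float = 0.0  # assumed running estimate of past episode returns

    def shape(self, episode_return: float) -> float:
        # The shaped reward is the improvement over the current threshold.
        shaped = episode_return - self.threshold
        # Move the threshold toward the latest return (moving average, assumed).
        self.threshold += self.beta * shaped
        return shaped


def potential_shaping_term(phi_s: float, phi_next: float, gamma: float) -> float:
    """Standard potential-based shaping term: gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


if __name__ == "__main__":
    shaper = SelfImprovementReward(beta=0.5)
    for episode_return in [10.0, 10.0, 12.0, 8.0]:
        shaped = shaper.shape(episode_return)
        print(f"return={episode_return} shaped={shaped} threshold={shaper.threshold}")
```

In this sketch, repeating the same return yields a shrinking shaped reward as the threshold catches up, and a return below the threshold yields a negative one, which is one way a reshaped signal could separate policies whose raw returns are close together.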


