b710915795b9e9c02cf10d6d2bdb688c-Paper.pdf

Neural Information Processing Systems 

The best-known work on reward shaping is potential-based reward shaping (PBRS) [12], which was the first to show that policy invariance is guaranteed when the shaping reward takes the form of a difference of potential values.
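To make the "difference of potential values" concrete, here is a minimal Python sketch. It assumes the standard PBRS form F(s, s') = gamma * Phi(s') - Phi(s), where Phi is a potential function over states and gamma is the discount factor. The names `shaping_reward`, `discounted_shaping_return`, and `phi`, along with the toy one-dimensional states, are illustrative assumptions and are not taken from the paper. The sketch shows the telescoping property behind policy invariance: along any trajectory, the discounted sum of shaping rewards depends only on the first state, the last state, and the trajectory length.

```python
from typing import Callable, Sequence


def shaping_reward(phi: Callable[[int], float], s: int, s_next: int, gamma: float) -> float:
    """Potential-based shaping term: gamma * Phi(s') - Phi(s)."""
    return gamma * phi(s_next) - phi(s)


def discounted_shaping_return(
    phi: Callable[[int], float], states: Sequence[int], gamma: float
) -> float:
    """Discounted sum of shaping terms along the visited states s_0, ..., s_T."""
    total = 0.0
    for t in range(len(states) - 1):
        total += (gamma ** t) * shaping_reward(phi, states[t], states[t + 1], gamma)
    return total


# Hypothetical potential (an assumption for illustration): negative distance to a goal at 5.
def phi(s: int) -> float:
    return -abs(5 - s)


gamma = 0.9
path_a = [0, 1, 2, 3]  # direct route
path_b = [0, 1, 0, 3]  # different intermediate states; same start, end, and length

# Closed form after telescoping: gamma^T * Phi(s_T) - Phi(s_0)
steps = len(path_a) - 1
closed_form = gamma ** steps * phi(path_a[-1]) - phi(path_a[0])

return_a = discounted_shaping_return(phi, path_a, gamma)
return_b = discounted_shaping_return(phi, path_b, gamma)
```

Both `return_a` and `return_b` match `closed_form` up to floating-point error. The shaping term therefore shifts returns by an amount that does not depend on the intermediate states visited, which is the intuition behind the invariance result attributed to [12].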
