Effects of sparse rewards of different magnitudes in the speed of learning of model-based actor critic methods
