Entropic Desired Dynamics for Intrinsic Control: Supplemental Material Steven Hansen

Neural Information Processing Systems 

While this is not close to the state-of-the-art in general (c.f. Figure 2 shows the effect of action entropy on exploratory behavior in Montezuma's Revenge. Number of unique avatar positions visited. Full training curves across all 6 Atari games are shown in Figure 1, including the random policy baseline. To ensure this didn't hamper performance, we At each state visited by the agent evaluator during training, the agent's state (consisting of the avatar's The full curves are included for completeness. The compute cluster we performed experiments on is heterogenous, and has features such as host-sharing, adaptive load-balancing, etc.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found