Supplementary Material: Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games

Neural Information Processing Systems 

In the supplementary material, we describe the training details and give examples of the game interface and the interactions used in the paper. We train our model using the Advantage Actor Critic (A2C) method [37] across valid actions. The function that returns the valid action set is provided by Jericho [20]. Similar to KG-A2C [3], a supervised auxiliary task, "valid action prediction", is introduced to assist RL training.

You are in attendance at the annual Grue Convention, this year a rather somber affair due to the "adventurer famine" that has gripped gruedom in this isolated corner of the empire.
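The training setup described above (A2C restricted to valid actions, plus a supervised "valid action prediction" auxiliary task) can be illustrated with a minimal single-step sketch. This is not the authors' implementation. It assumes the auxiliary task is a binary cross-entropy over candidate actions, the function names (`masked_softmax`, `a2c_losses`) and the coefficients `value_coef` and `aux_coef` are hypothetical, and the valid action mask is supplied directly rather than queried from Jericho. Entropy regularisation, which is often used with A2C, is omitted for brevity.

```python
import numpy as np

def masked_softmax(logits, valid_mask):
    """Softmax over valid actions only; invalid actions get probability 0.
    Assumes at least one action is valid."""
    masked = np.where(valid_mask, logits, -np.inf)
    masked = masked - masked.max()          # subtract max for numerical stability
    exp = np.exp(masked)                    # exp(-inf) == 0 for invalid actions
    return exp / exp.sum()

def a2c_losses(logits, valid_mask, action, value, ret, valid_logits,
               value_coef=0.5, aux_coef=1.0):
    """Single-step A2C loss terms plus an auxiliary valid-action loss.

    logits       : policy scores for every candidate action
    valid_mask   : boolean array, True where the action is valid
    action       : index of the (valid) action that was taken
    value        : critic's value estimate for the current state
    ret          : observed (discounted) return from the current state
    valid_logits : scores of the auxiliary valid-action predictor
    value_coef, aux_coef : hypothetical weighting coefficients
    """
    probs = masked_softmax(logits, valid_mask)
    advantage = ret - value                 # treated as a constant in the actor term
    policy_loss = -np.log(probs[action]) * advantage
    value_loss = (ret - value) ** 2         # critic regression loss

    # Auxiliary task (assumed form): binary cross-entropy between the
    # predicted validity of each action and the valid action mask.
    target = valid_mask.astype(float)
    pred = 1.0 / (1.0 + np.exp(-valid_logits))
    aux_loss = -(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean()

    total = policy_loss + value_coef * value_loss + aux_coef * aux_loss
    return {"policy": policy_loss, "value": value_loss,
            "aux": aux_loss, "total": total}

# Example with four candidate actions, two of which are valid.
logits = np.array([1.0, 2.0, 0.5, -1.0])
valid_mask = np.array([True, False, True, False])
losses = a2c_losses(logits, valid_mask, action=0, value=0.2, ret=1.0,
                    valid_logits=np.array([2.0, -2.0, 2.0, -2.0]))
```

Masking before the softmax means the policy never assigns probability to an action outside the valid set, while the auxiliary term gives the network a direct supervised signal about which actions are valid.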
