Supplementary Material for Thought Cloning: Learning to Think while Acting by Imitating Human Thinking

Anonymous Author(s)

Neural Information Processing Systems

A Architecture and Training Details

The pseudocode for the Thought Cloning (TC) training framework is shown in Algorithm 1. Backpropagation Through Time was truncated at 20 steps in TC. Detailed hyperparameter settings are shown in Table 1. Figure 1 presents an example trajectory. For instance, the plan could be to "open the red door".

Figure 3: Example trajectories of agents trained with different strategies.

Because of the realization from being able to observe the agent's
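To make the 20-step truncation of Backpropagation Through Time concrete, the following is a minimal sketch of how a trajectory might be split into windows over which gradients are propagated. Only the horizon of 20 comes from the text. The helper name `bptt_windows`, the non-overlapping chunking, and the handling of a shorter final window are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Tuple

BPTT_HORIZON = 20  # truncation length stated in the text


def bptt_windows(num_steps: int, horizon: int = BPTT_HORIZON) -> List[Tuple[int, int]]:
    """Split a trajectory of `num_steps` steps into half-open (start, end) windows.

    Hypothetical helper: assumes consecutive, non-overlapping windows of at
    most `horizon` steps. Gradients would flow only within a window; the
    recurrent state would be carried across windows as a constant.
    """
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    # Each window starts where the previous one ended; the last may be shorter.
    return [(start, min(start + horizon, num_steps))
            for start in range(0, num_steps, horizon)]


if __name__ == "__main__":
    # A 45-step episode yields two full windows and one 5-step remainder.
    for start, end in bptt_windows(45):
        print(f"backpropagate through steps [{start}, {end})")
```

In a recurrent training loop under these assumptions, the hidden state at the end of each window would be passed forward to the next window but excluded from its gradient computation, so no gradient path spans more than 20 steps.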
