We thank the reviewers for their positive feedback, mentioning DTSIL as an effective novel method (R2,3,4) for a significant

Neural Information Processing Systems 

We will incorporate the suggestions. More details were provided in Appendix B.1, especially

We will add these pointers and more description to the main text to clarify our algorithm. We will make the connection between DTSIL and prior work clearer, especially for the imitation learning part. Pseudocode for organizing clusters was provided in Appendix A.3.

DTSIL+EXP without SL performs worse on Montezuma's Revenge

Assume the agent's location in the state embeddings is normalized to

We will add this comparison and more discussion of off-policy and model-based exploration methods.
