Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes Supplementary Materials
Neural Information Processing Systems
All of these tasks are benchmarks used to test models with long-term memory. It is similar to the previous one. Each image is flattened into a one-dimensional vector. The agent can only move forward along the corridor. The agent receives the reward only at the end of the episode.
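As an illustration of the setup described above, here is a minimal sketch of flattening an image into a one-dimensional vector and of a forward-only corridor whose reward arrives only on the final step. The names `flatten_image`, `Corridor`, and `run_episode`, along with the corridor length and the reward value, are hypothetical and chosen for illustration; the source does not specify image sizes, corridor lengths, or reward magnitudes.

```python
from typing import List, Tuple


def flatten_image(image: List[List[float]]) -> List[float]:
    """Flatten a 2D image (a list of pixel rows) into a 1D vector, row by row."""
    return [pixel for row in image for pixel in row]


class Corridor:
    """Hypothetical forward-only corridor (illustrative, not the paper's code).

    The only action is a step forward, and the reward is non-zero
    only on the step that ends the episode.
    """

    def __init__(self, length: int, final_reward: float = 1.0) -> None:
        if length < 1:
            raise ValueError("length must be at least 1")
        self.length = length
        self.final_reward = final_reward  # assumed value, not from the source
        self.position = 0

    def reset(self) -> int:
        """Put the agent back at the start of the corridor."""
        self.position = 0
        return self.position

    def step(self) -> Tuple[int, float, bool]:
        """Move one cell forward and return (position, reward, done)."""
        if self.position >= self.length:
            raise RuntimeError("episode has already finished; call reset()")
        self.position += 1
        done = self.position == self.length
        reward = self.final_reward if done else 0.0
        return self.position, reward, done


def run_episode(env: Corridor) -> List[float]:
    """Walk the corridor to the end and collect the per-step rewards."""
    env.reset()
    rewards: List[float] = []
    done = False
    while not done:
        _, reward, done = env.step()
        rewards.append(reward)
    return rewards
```

For example, `run_episode(Corridor(4))` yields zero reward on every step except the last, which is what makes such a task a test of long-term memory: whatever the agent needs to act correctly at the end must be carried through the whole episode.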
Aug-19-2025, 17:52:46 GMT