Review for NeurIPS paper: Instance-based Generalization in Reinforcement Learning
Weaknesses: The paper omits many details, which prevents the reader from judging the novelty and full contribution of the work. After reading the rebuttal, I still think an overview of the proposed solution and the problem setting would be of much help to readers. Is the entire game (with all levels) considered as a POMDP? I see sentences such as "Line 62: environment is considered as a Markov process". How is the generalization problem being modelled?