Review for NeurIPS paper: Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

Neural Information Processing Systems 

Weaknesses: A few comments that need to be addressed: 1) The first comment concerns the presentation of the derivations. Steps are skipped both in the appendix and in the main text. Some of them took me a while to rederive, and others I could not spend the time to rederive. Some steps in the main text are also taken for granted. It would be useful to elaborate on them.