Neural Information Processing Systems 

Thank you for the thoughtful and careful reviews. We hope the AC nominates some of you for reviewer awards. Common concern: there should be a baseline using MCTS and assuming access to a simulator / common random numbers. There appears to be some imprecision in the reviews about what this means. Then environment stochasticity is re-sampled and the algorithm repeats.