CoDA data
[R2, R3] Amount of augmented data and sample efficiency
R3 asked why more CoDA samples don't always increase performance. This is all we meant by Remark 3.1: that within [...] We agree our "intuitive" explanation of minimality might mislead in the way [...] RL/Causal literatures, we show a broad application of causal techniques yielding empirical sample efficiency in RL. The "mental ignorance" comment at the end of Remark [...] B's actual thoughts (but not agent A's belief about agent B's thoughts), and other true facts that agent A is ignorant of. We did not try the delta state trick; this is a helpful suggestion (thanks!) that we [...] Note that CoDA + MBPO were complementary in the Batch RL case.