Reviews: Multi-step learning and underlying structure in statistical models

Neural Information Processing Systems 

In the end (from the proof of Theorem 3) I conclude that in all cases f_p(x) P(Y 1 X x) . This should be stated explicitly!