Review for NeurIPS paper: Learnability with Indirect Supervision Signals


Weaknesses: 1. Are there lower bounds matching the main upper bound (Thm. 4.2, Eq. 2)? If there were only a single transition T equal to the identity, the answer is "no", since the faster rate from realizable PAC learning is available. How would the result need to be generalized so that the upper and lower bounds match? At least a discussion of the matter would help make the limitations of the present work clearer. E.g., if there are only 2 indirect labels and 4 true labels, show a concrete example of learning the true labeling function in this harder case. One option: 1-D linear thresholds with fixed distances between the class-transition discontinuities, combined with noisy subsetting. Working through this both analytically, to give intuition for why learning more classes from fewer indirect labels is possible, and empirically would be useful.
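
To make the suggested construction concrete, here is a minimal simulation sketch of what I have in mind (my own notation and choices, not the authors' setting): four true classes defined by unit-spaced 1-D thresholds, a known subset transition T mapping true labels {1, 3} to indirect label 1 with a 10% flip rate, and empirical risk minimization over the single offset parameter. The subset {1, 3}, the flip rate, the sample size, and the grid search are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 4 true classes, 2 indirect labels.
# True labeling function: thresholds at b* - 1, b*, b* + 1 (fixed unit spacing).
B_STAR, EPS, N = 0.3, 0.1, 5000

def true_label(x, b):
    """4-class threshold labeling with unit-spaced boundaries at b-1, b, b+1."""
    return np.digitize(x, [b - 1, b, b + 1])

x = rng.uniform(-3, 3, N)
y = true_label(x, B_STAR)

# Indirect signal: o = 1{y in {1, 3}} (a "subset" annotation), flipped w.p. EPS.
o_clean = np.isin(y, [1, 3]).astype(int)
flip = rng.random(N) < EPS
o = np.where(flip, 1 - o_clean, o_clean)

# Learner: T is assumed known, so pick the offset b minimizing empirical
# disagreement between the predicted and observed indirect labels.
grid = np.linspace(-2, 2, 401)
losses = [np.mean(np.isin(true_label(x, b), [1, 3]).astype(int) != o) for b in grid]
b_hat = grid[int(np.argmin(losses))]

acc = np.mean(true_label(x, b_hat) == y)
print(f"b* = {B_STAR:.2f}, b_hat = {b_hat:.2f}, 4-class accuracy = {acc:.3f}")
```

The intuition this is meant to surface: identifiability comes from the fixed spacing between thresholds. Shifting the offset moves the locations where the indirect label alternates, so the full 4-class labeling is recoverable from the 2-label signal alone, even though T is far from the identity.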