Dear reviewers: Thank you very much for taking the time to review our paper and for providing us with valuable and

Neural Information Processing Systems 

Below, we answer all of the questions.

Q1) It wasn't totally clear to me how you ensure Also, is T initialized to zero matrix, or randomly?
About T, we initialize it to be a zero matrix in our experiments.

R1 Q2) Can this approach take advantage of a small clean set?
If a small clean set is available, it is helpful.