Reviewer 1 - Use of mini-batches: in our experiments, we indeed use mini-batches of size B, by sampling B points

Neural Information Processing Systems 

We would like to thank all reviewers for their valuable feedback and comments. Please find our responses below. This is because it predicts an almost uniform distribution. AdaCVaR also has a lower CVaR than ERM (standard SGD). Thank you for observing that.
