Appendix

Neural Information Processing Systems 

We define a supervised task as the optimization problem associated with the tuple of dataset distribution p(x, y) and loss function L(p, f): (p, f) 7! R