Appendix A Preliminaries
A.1 Notation
A.2 Assumptions
A.3 Definitions

Neural Information Processing Systems 

Then a predictor is an ERM if and only if it predicts the most likely label for all examples in the dataset.
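
The statement above can be checked by brute force on a toy dataset. The sketch below is illustrative only and rests on assumptions not stated in this excerpt: the loss is the 0-1 loss, the hypothesis class contains every function from the distinct inputs to the labels, "most likely label" is read as the label that occurs most often for that input in the dataset, and no input has tied label counts. The names `empirical_errors`, `majority_predictor`, and `all_erms` are hypothetical helpers, not notation from the paper.

```python
from collections import Counter, defaultdict
from itertools import product


def empirical_errors(predictor, data):
    """Number of 0-1 mistakes a dict predictor {x: label} makes on (x, y) pairs."""
    return sum(predictor[x] != y for x, y in data)


def majority_predictor(data):
    """Predict, for each distinct input, its most frequent label in the data."""
    counts = defaultdict(Counter)
    for x, y in data:
        counts[x][y] += 1
    return {x: c.most_common(1)[0][0] for x, c in counts.items()}


def all_erms(data, labels):
    """Enumerate every predictor on the distinct inputs and keep the risk minimizers."""
    xs = sorted({x for x, _ in data})
    candidates = [dict(zip(xs, ys)) for ys in product(labels, repeat=len(xs))]
    best = min(empirical_errors(p, data) for p in candidates)
    return [p for p in candidates if empirical_errors(p, data) == best]


# Toy dataset (assumed for illustration): input "a" appears with conflicting labels.
data = [("a", 0), ("a", 0), ("a", 1), ("b", 1), ("b", 1), ("c", 0)]
erms = all_erms(data, labels=(0, 1))

# With no ties, the only empirical risk minimizer is the majority-label predictor.
print(erms == [majority_predictor(data)])
```

Under these assumptions the empirical risk decomposes into a sum over distinct inputs, so each input can be optimized independently, which is why the majority label per input is both necessary and sufficient. If some input had tied label counts, several predictors would attain the minimum and the enumeration would return more than one ERM.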