Reviews: Learning with Average Top-k Loss

Neural Information Processing Systems 

This paper investigates a new learning setting: optimizing the average k largest (top-k) individual functions for supervised learning. This setting is different from the standard Empirical Risk minimization (ERM), which optimize the average loss function over datasets. The proposed setting is also different from maximum loss (Shalev-Shwartz and Wexler 2016), which optimize the maximum loss. This paper tries to optimize the average top-k loss functions. This can be viewed as a natural generalization of the ERM and the maximum loss.