Learning Surrogate Losses
Grabocka, Josif; Scholz, Randolf; Schmidt-Thieme, Lars
The minimization of loss functions is the heart and soul of Machine Learning. In this paper, we propose an off-the-shelf optimization approach that can seamlessly minimize virtually any non-differentiable and non-decomposable loss function (e.g. Misclassification Rate, AUC, F1, Jaccard Index, Matthews Correlation Coefficient). Our strategy learns smooth relaxations of the true losses by approximating them with a surrogate neural network. The proposed loss networks are set-wise models that are invariant to the order of the instances in a mini-batch. The surrogate losses are learned jointly with the prediction model via bilevel optimization. Empirical results on multiple datasets with diverse real-life loss functions, compared against state-of-the-art baselines, demonstrate the efficiency of learning surrogate losses.
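To make the set-wise idea concrete, the following is a minimal sketch and not the authors' architecture. It contrasts a true, non-decomposable loss (F1 computed over a whole mini-batch) with a smooth set-wise function of the same inputs. The class name `SetwiseSurrogate`, the mean-pooling design, the hidden size, and the random untrained weights are all assumptions made for illustration. Training the surrogate and the bilevel loop are omitted.

```python
import math
import random


def f1_score(y_true, y_prob, threshold=0.5):
    """True F1 on a mini-batch: thresholded, so it is not differentiable,
    and it depends on batch-level counts, so it is not a sum of per-instance terms."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


class SetwiseSurrogate:
    """Hypothetical set-wise loss network (assumed design, untrained):
    embed each (label, prediction) pair, mean-pool over the batch,
    then map the pooled vector to a scalar in (0, 1)."""

    def __init__(self, hidden=4, seed=0):
        rng = random.Random(seed)
        # One weight row per hidden unit: (label weight, prediction weight, bias).
        self.w = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
        self.v = [rng.uniform(-1, 1) for _ in range(hidden)]

    def embed(self, y, p):
        return [math.tanh(w[0] * y + w[1] * p + w[2]) for w in self.w]

    def __call__(self, y_true, y_prob):
        n = len(y_true)
        pooled = [0.0] * len(self.w)
        for y, p in zip(y_true, y_prob):
            for k, h in enumerate(self.embed(y, p)):
                pooled[k] += h / n  # mean pooling ignores instance order
        z = sum(v * h for v, h in zip(self.v, pooled))
        return 1.0 / (1.0 + math.exp(-z))  # smooth in every prediction p


y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.1]
order = [2, 0, 3, 1]  # an arbitrary permutation of the batch
shuffled_true = [y_true[i] for i in order]
shuffled_prob = [y_prob[i] for i in order]

surrogate = SetwiseSurrogate()
original_value = surrogate(y_true, y_prob)
shuffled_value = surrogate(shuffled_true, shuffled_prob)

print("true F1:", f1_score(y_true, y_prob))
print("order-invariant:", math.isclose(original_value, shuffled_value, abs_tol=1e-12))
```

Mean pooling is only one way to obtain order invariance; it is used here because it keeps the sketch short. In the approach described above, the surrogate's parameters would be fitted to match the true loss while the prediction model is trained against the surrogate.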
May-24-2019
- Genre:
- Research Report (0.50)
- Technology: