Training Over-parameterized Models with Non-decomposable Objectives
Harikrishna Narasimhan, Aditya Krishna Menon
arXiv.org Artificial Intelligence
Many modern machine learning applications come with complex and nuanced design goals such as minimizing the worst-case error, satisfying a given precision or recall target, or enforcing group-fairness constraints. Popular techniques for optimizing such non-decomposable objectives reduce the problem to a sequence of cost-sensitive learning tasks, each of which is then solved by re-weighting the training loss with example-specific costs. We point out that the standard approach of re-weighting the loss to incorporate label costs can produce unsatisfactory results when used to train over-parameterized models. As a remedy, we propose new cost-sensitive losses that extend the classical idea of logit adjustment to handle more general cost matrices. Our losses are calibrated, and can be further improved with distilled labels from a teacher model. Through experiments on benchmark image datasets, we showcase the effectiveness of our approach in training ResNet models with common robust and constrained optimization objectives.
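The key move the abstract describes is to fold costs into the logits rather than re-weight the loss. Below is a minimal sketch of that logit-adjustment idea, assuming the simplest case of a diagonal gain matrix (one positive weight per class); the function name and the `gains` argument are illustrative, not the paper's API, and the paper's full method additionally handles general cost matrices and teacher-distilled labels:

```python
import torch
import torch.nn.functional as F

def logit_adjusted_cs_loss(logits: torch.Tensor,
                           targets: torch.Tensor,
                           gains: torch.Tensor) -> torch.Tensor:
    """Cost-sensitive cross-entropy via logit adjustment (sketch).

    logits:  (batch, num_classes) raw model outputs
    targets: (batch,) integer class labels
    gains:   (num_classes,) positive per-class gains; a larger gain
             places more weight on getting that class right
    """
    # Shift each class logit down by log(gain) during training. If the
    # adjusted logits fit the class posteriors, then a plain argmax over
    # the *unadjusted* logits at test time mimics argmax_y gain_y * p(y|x),
    # i.e. the cost-sensitive Bayes rule, instead of changing the loss
    # scale per example the way loss re-weighting does.
    adjusted = logits - torch.log(gains)
    return F.cross_entropy(adjusted, targets)

# Hypothetical usage: up-weight class 0 in a 3-class problem.
logits = torch.randn(8, 3, requires_grad=True)
targets = torch.randint(0, 3, (8,))
gains = torch.tensor([2.0, 1.0, 1.0])
loss = logit_adjusted_cs_loss(logits, targets, gains)
loss.backward()
```

The design point, per the abstract, is that the adjustment keeps the loss calibrated for the cost-sensitive objective, whereas re-weighted losses can be driven to near-zero by over-parameterized models that interpolate the training data, leaving the weights with little effect.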
July 9, 2021