Selective Mixup Fine-Tuning for Optimizing Non-Decomposable Objectives

Shrinivas Ramasubramanian, Harsh Rangwani, Sho Takemori, Kunal Samanta, Yuhei Umeda, Venkatesh Babu Radhakrishnan

arXiv.org Machine Learning 

The rise in internet usage has led to the generation of massive amounts of data, driving the adoption of various supervised and semi-supervised machine learning algorithms that can effectively exploit this data to train models. However, before these models are deployed in the real world, they must be rigorously evaluated on practical performance measures such as worst-case recall and must satisfy constraints such as fairness. We find that current state-of-the-art empirical techniques offer sub-optimal performance on these practical, non-decomposable performance objectives. To bridge the gap, we propose SelMix, an inexpensive selective mixup-based fine-tuning technique for pre-trained models that optimizes for the desired objective. The core idea of our framework is to determine a sampling distribution over class pairs and perform mixup of features between samples from those classes so as to optimize the given objective. We comprehensively evaluate our technique against existing empirical and theoretically principled methods on standard benchmark datasets for imbalanced classification, and find that the proposed SelMix fine-tuning significantly improves performance on various practical non-decomposable objectives across benchmarks.

The rise of deep networks has shown great promise, with near-perfect performance across computer vision tasks (He et al., 2022; Kolesnikov et al., 2020; Kirillov et al., 2023; Girdhar et al., 2023). This has led to their widespread deployment in practical applications, some of which have critical consequences (Castelvecchi, 2020). Hence, deployed models must perform robustly across the entire data distribution, not just its majority portion. Such failure cases are often overlooked when accuracy is the only performance metric. Therefore, more practical metrics such as the harmonic mean of recalls (Recall H-Mean; Sun et al., 2006) and the worst-case (minimum) recall (Narasimhan & Menon, 2021; Mohri et al., 2019) should be used for evaluation.
