Exploring the Design of Adaptation Protocols for Improved Generalization and Machine Learning Safety

Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan

arXiv.org Artificial Intelligence 

While directly fine-tuning (FT) large-scale, pretrained models on task-specific data is well known to induce strong in-distribution task performance, recent works have demonstrated that different adaptation protocols, such as linear probing (LP) prior to FT, can improve out-of-distribution generalization. However, the design space of such adaptation protocols remains under-explored and the evaluation of such protocols has primarily focused on distribution shifts. Therefore, in this work, we evaluate common adaptation protocols across distribution shifts and machine learning safety metrics (e.g., anomaly detection, calibration, robustness to corruptions). We find that protocols ...

While directly fine-tuning (FT) such models on task-specific data is known to improve in-distribution (ID) task performance (Neyshabur et al., 2020; Zhuang et al., 2019; Chen et al., 2020), recent work finds FT does not effectively leverage the expressiveness of large-scale, pretrained representations and fails to match the out-of-distribution (OOD) performance of other adaptation protocols, such as the LP + FT protocol, which performs linear probing (LP) prior to FT (Kumar et al., 2022). Concurrently, Kirichenko et al. (2022) find that simply retraining the last (classifier) layer with a small amount of "re-weighting" or minority group data can safeguard against spurious correlations. Crucially, both works suggest that well-designed adaptation protocols can improve both ID task performance and robustness.
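To make the distinction between LP, FT, and LP + FT concrete, here is a minimal NumPy sketch. It assumes a toy two-layer linear model (a "pretrained" feature matrix `W` and a head `v`) trained with squared loss and plain gradient descent. The function names, learning rates, and step counts are illustrative assumptions, not the setup used in the paper.

```python
import numpy as np

def mse(W, v, X, y):
    """Mean squared error of the model x -> (x @ W) @ v."""
    return float(np.mean((X @ W @ v - y) ** 2))

def linear_probe(W, v, X, y, lr=0.05, steps=500):
    """LP: train only the head v; the feature extractor W stays frozen."""
    v = v.copy()
    feats = X @ W                      # frozen features
    for _ in range(steps):
        resid = feats @ v - y
        v -= lr * 2.0 * feats.T @ resid / len(y)
    return v

def fine_tune(W, v, X, y, lr=0.01, steps=200):
    """FT: train both the feature extractor W and the head v."""
    W, v = W.copy(), v.copy()
    for _ in range(steps):
        resid = X @ W @ v - y
        grad_v = 2.0 * (X @ W).T @ resid / len(y)
        grad_W = 2.0 * np.outer(X.T @ resid, v) / len(y)
        v -= lr * grad_v
        W -= lr * grad_W
    return W, v

def lp_then_ft(W_pre, X, y):
    """LP + FT: fit the head on frozen features first, then fine-tune everything."""
    v0 = np.zeros(W_pre.shape[1])      # untrained head (assumed zero init)
    v_lp = linear_probe(W_pre, v0, X, y)
    return fine_tune(W_pre, v_lp, X, y)
```

In this sketch, last-layer retraining in the spirit of Kirichenko et al. (2022) would correspond to calling `linear_probe` alone on a small re-weighting dataset; how that dataset is built is outside the scope of the example.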
