Shin, Juhyeon
SF(DA)$^2$: Source-free Domain Adaptation Through the Lens of Data Augmentation
Hwang, Uiwon, Lee, Jonghyun, Shin, Juhyeon, Yoon, Sungroh
In the face of deep learning models' vulnerability to domain shift, source-free domain adaptation (SFDA) methods have been proposed to adapt models to new, unseen target domains without requiring access to source domain data. Although the potential benefits of applying data augmentation to SFDA are attractive, several challenges arise, such as the dependence on prior knowledge of class-preserving transformations and the increase in memory and computational requirements. To obtain these benefits without explicitly augmenting the data, we construct an augmentation graph in the feature space of the pretrained model using the neighbor relationships between target features and propose spectral neighborhood clustering to identify partitions in the prediction space. Furthermore, we propose implicit feature augmentation and feature disentanglement as regularization loss functions that effectively utilize class semantic information within the feature space. These regularizers simulate the inclusion of an unlimited number of augmented target features into the augmentation graph while minimizing computational and memory demands. Our method shows superior adaptation performance in SFDA scenarios, including 2D image and 3D point cloud datasets as well as a highly imbalanced dataset.

In recent years, deep learning has achieved significant advancements and is widely explored for real-world applications. However, the performance of deep learning models can deteriorate significantly when they are deployed on unlabeled target domains that differ from the source domain where the training data was collected. This domain shift poses a challenge for applying deep learning models in practical scenarios.
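The abstract describes the augmentation graph and the clustering objective only at a high level; the following is a minimal PyTorch sketch of how such a neighbor-graph objective might look. The function name `snc_loss`, the memory bank, the neighbor count `k`, and the exact attraction/repulsion terms are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def snc_loss(feats, probs, bank_feats, bank_probs, k=5):
    """Illustrative spectral-neighborhood-clustering-style objective.

    feats:      (B, D) target features from the pretrained encoder
    probs:      (B, C) softmax predictions for the current batch
    bank_feats: (N, D) memory bank of previously seen target features
    bank_probs: (N, C) predictions stored alongside the bank features
    """
    # Edges of the augmentation graph: k nearest neighbors in feature space.
    sim = F.normalize(feats, dim=1) @ F.normalize(bank_feats, dim=1).T  # (B, N)
    nn_idx = sim.topk(k, dim=1).indices                                 # (B, k)

    # Attraction: pull each prediction toward those of its graph neighbors,
    # so samples connected in the graph fall into the same partition.
    nn_probs = bank_probs[nn_idx]                                       # (B, k, C)
    pos = (probs.unsqueeze(1) * nn_probs).sum(dim=-1).mean()

    # Repulsion: push apart predictions of different batch samples, which
    # acts like the off-diagonal term of a spectral clustering objective
    # and prevents a collapsed solution.
    gram = probs @ probs.T                                              # (B, B)
    b = probs.size(0)
    neg = (gram.sum() - gram.diagonal().sum()) / (b * (b - 1))

    return neg - pos
```

In a full training loop, a term like this would be combined with the implicit feature augmentation and feature disentanglement regularizers mentioned in the abstract.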
Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors
Lee, Jonghyun, Jung, Dahuin, Lee, Saehyung, Park, Junsung, Shin, Juhyeon, Hwang, Uiwon, Yoon, Sungroh
The primary challenge of test-time adaptation (TTA) is limited access to the entire test dataset during online updates, causing error accumulation. To mitigate it, TTA methods have utilized the entropy of the model's output as a confidence metric that aims to determine which samples have a lower likelihood of causing error. Through experimental studies, however, we observed the unreliability of entropy as a confidence metric for TTA under biased scenarios, and we theoretically revealed that this stems from neglecting the influence of latent disentangled factors of the data on predictions. Building upon these findings, we introduce a novel TTA method named Destroy Your Object (DeYO), which leverages a newly proposed confidence metric named Pseudo-Label Probability Difference (PLPD). PLPD quantifies the influence of an object's shape on the prediction by measuring the difference between predictions before and after applying an object-destructive transformation. DeYO consists of sample selection and sample weighting, which employ entropy and PLPD concurrently. For robust adaptation, DeYO prioritizes samples that dominantly incorporate shape information when making predictions. Our extensive experiments demonstrate the consistent superiority of DeYO over baseline methods across various scenarios, including biased and wild ones.

Although deep neural networks (DNNs) demonstrate powerful performance across various domains, they lack robustness against distribution shifts under conventional training (He et al., 2016; Pan & Yang, 2009). Therefore, research areas such as domain generalization (Blanchard et al., 2011; Gulrajani & Lopez-Paz, 2021), which involves training models to be robust against arbitrary distribution shifts, and unsupervised domain adaptation (UDA) (Ganin & Lempitsky, 2015; Park et al., 2020), which seeks domain-invariant information for label-absent target domains, have been extensively investigated in the existing literature. Test-time adaptation (TTA) (Wang et al., 2021a) has also gained significant attention as a means to address distribution shifts occurring during test time. TTA leverages each data point once for adaptation immediately after inference, and its minimal overhead compared to existing areas makes it particularly suitable for real-world applications (Azimi et al., 2022). Unlike UDA, which assumes access to the entire test set before adaptation and can exploit task information by analyzing its distribution (Kang et al., 2019), TTA observes only the samples arriving at each step. This limited view leads to inaccurate predictions on some samples, and incorporating them into model updates results in error accumulation within the model (Arazo et al., 2020).
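The PLPD metric described above is concrete enough to sketch. Below is a minimal PyTorch illustration; the function name `plpd`, the grid size, and the use of patch shuffling as the object-destructive transformation are assumptions for illustration, not necessarily the paper's exact setup.

```python
import torch

def plpd(model, x, grid=4):
    """Illustrative Pseudo-Label Probability Difference (PLPD) computation.

    Measures how much the pseudo-label's probability drops after an
    object-destructive transformation; patch shuffling is used here as
    one such transformation (an assumed choice).
    """
    with torch.no_grad():
        p_before = model(x).softmax(dim=1)            # (B, C)
        y_hat = p_before.argmax(dim=1, keepdim=True)  # (B, 1) pseudo-labels

        # Destroy global shape: cut each image into a grid of patches and
        # permute them, preserving local texture but breaking the object.
        B, C, H, W = x.shape
        ph, pw = H // grid, W // grid
        patches = x.unfold(2, ph, ph).unfold(3, pw, pw)   # (B, C, g, g, ph, pw)
        patches = patches.reshape(B, C, grid * grid, ph, pw)
        perm = torch.randperm(grid * grid, device=x.device)
        patches = patches[:, :, perm].reshape(B, C, grid, grid, ph, pw)
        x_destroyed = patches.permute(0, 1, 2, 4, 3, 5).reshape(B, C, H, W)

        p_after = model(x_destroyed).softmax(dim=1)

        # PLPD: drop in probability assigned to the original pseudo-label.
        return (p_before.gather(1, y_hat) - p_after.gather(1, y_hat)).squeeze(1)
```

A high PLPD indicates the prediction depends on the object's shape rather than texture or background; per the abstract, DeYO's sample selection and weighting would combine this score with entropy.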
Gradient Alignment with Prototype Feature for Fully Test-time Adaptation
Shin, Juhyeon, Lee, Jonghyun, Lee, Saehyung, Park, Minjun, Lee, Dongjun, Hwang, Uiwon, Yoon, Sungroh
We propose a regularizer, Gradient Alignment with Prototype feature (GAP), which alleviates the inappropriate guidance from the entropy minimization loss on misclassified pseudo-labels. We developed a gradient alignment loss to precisely manage the adaptation process, ensuring that changes made for some data do not negatively impact the model's performance on other data. We introduce the prototype feature of a class as a proxy measure of this negative impact, and we make the GAP regularizer feasible under the TTA constraints, where the model can only access test data that is streamed online.

TTA focuses on adapting a model during the inference phase, using only the test data that is streamed online, without access to the training data or test labels. Common strategies employed in TTA include objectives like entropy minimization [Wang et al., 2021] or cross-entropy with pseudo-labels [Goyal et al., 2022], designed to guide the model's self-supervision. However, these methods are susceptible to confirmation bias [Arazo et al., 2020], where data with noisy predictions can lead the model to continually learn in the wrong direction.
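The abstract describes GAP only verbally; below is a minimal, hypothetical PyTorch sketch of a gradient-alignment term built on classifier-weight prototypes. The function name `gap_regularizer`, the use of entropy as the per-sample loss, and cosine similarity as the alignment measure are all assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gap_regularizer(feats, classifier):
    """Illustrative gradient-alignment regularizer with prototype features.

    The class prototype is approximated by the classifier's weight vector
    for the pseudo-label; everything else here is an assumed design choice.
    """
    logits = classifier(feats)                    # (B, C), classifier: nn.Linear
    pseudo = logits.argmax(dim=1)

    # Gradient of the batch's entropy loss w.r.t. the classifier weights:
    # the direction the TTA update would push the model.
    ent = -(logits.softmax(1) * logits.log_softmax(1)).sum(1).mean()
    g_sample = torch.autograd.grad(ent, classifier.weight, create_graph=True)[0]

    # Gradient the class prototypes (classifier weight vectors) would
    # produce if classified with their own class as the label.
    protos = classifier.weight.detach()[pseudo]   # (B, D) prototype features
    ce = F.cross_entropy(classifier(protos), pseudo)
    g_proto = torch.autograd.grad(ce, classifier.weight, create_graph=True)[0]

    # Penalize sample updates whose direction conflicts with the update
    # the prototypes themselves would favor.
    return 1 - F.cosine_similarity(g_sample.flatten(), g_proto.flatten(), dim=0)
```

In use, this scalar would be added as a weighted term to the TTA objective so that per-sample updates stay aligned with what the class prototypes suggest, discouraging updates that help one stream of data at the expense of others.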