Unsupervised Domain Adaptation within Deep Foundation Latent Spaces

Kangin, Dmitry, Angelov, Plamen

arXiv.org Artificial Intelligence 

Vision transformer-based foundation models, such as ViT or DINOv2, aim to solve problems with little or no finetuning of features. Using a prototypical-network setting, we analyse to what extent such foundation models can solve unsupervised domain adaptation without finetuning on either the source or the target domain. Through quantitative analysis, as well as qualitative interpretation of decision making, we demonstrate that the suggested method can improve upon existing baselines, and we also showcase the limitations of such an approach that are yet to be solved.

With the advancement of foundation models, improvements in semi- and unsupervised learning methods can shift from end-to-end training towards decision making over the foundation models' latent spaces (Oquab et al. (2023); Angelov et al. (2023)). Below we describe the problem of unsupervised domain adaptation (UDA) (Saenko et al. (2010)).
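The prototypical-network setting referred to above can be illustrated with a minimal sketch: class prototypes are computed as mean embeddings of labelled source-domain samples, and target-domain samples are classified by nearest prototype. The code below is an illustrative assumption, not the paper's implementation; the synthetic Gaussian features stand in for embeddings from a frozen backbone such as ViT or DINOv2.

```python
import numpy as np

def class_prototypes(features, labels, num_classes):
    # Mean embedding per class over labelled source-domain samples.
    return np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])

def nearest_prototype(features, prototypes):
    # Assign each target embedding to its nearest prototype (Euclidean distance).
    dists = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=-1)
    return dists.argmin(axis=1)

# Toy example: synthetic 8-d embeddings standing in for frozen-backbone features.
rng = np.random.default_rng(0)
src = np.concatenate([rng.normal(0.0, 0.1, (20, 8)),   # class 0 cluster
                      rng.normal(1.0, 0.1, (20, 8))])  # class 1 cluster
src_y = np.array([0] * 20 + [1] * 20)
protos = class_prototypes(src, src_y, num_classes=2)

tgt = rng.normal(1.0, 0.1, (5, 8))  # target samples near the class-1 cluster
print(nearest_prototype(tgt, protos))  # all five map to class 1
```

Note that no gradient updates are involved: the only learned quantities are the prototypes themselves, which is what makes the setting attractive when the backbone is kept frozen.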
