Goto

Collaborating Authors

 Chung, Sae-Young


Test-Time Adaptation via Self-Training with Nearest Neighbor Information

arXiv.org Artificial Intelligence

Test-time adaptation (TTA) aims to adapt a trained classifier using only online unlabeled test data, without any information related to the training procedure. Most existing TTA methods adapt the trained classifier using its predictions on the test data as pseudo-labels. However, under test-time domain shift, the accuracy of these pseudo-labels cannot be guaranteed, and the adapted classifier therefore often suffers performance degradation. To overcome this limitation, we propose a novel test-time adaptation method, called Test-time Adaptation via Self-Training with nearest neighbor information (TAST), which is composed of the following procedures: (1) add trainable adaptation modules on top of the trained feature extractor; (2) define a new pseudo-label distribution for the test data using nearest neighbor information; (3) train these modules for only a few steps during test time to match the nearest neighbor-based pseudo-label distribution to a prototype-based class distribution for the test data; and (4) predict the label of each test sample using the averaged class distribution predicted by these modules. The pseudo-label generation is based on the basic intuition that a test sample and its nearest neighbors in the embedding space are likely to share the same label under domain shift. By utilizing multiple randomly initialized adaptation modules, TAST extracts information useful for classifying the test data under domain shift from the nearest neighbor information. TAST outperforms state-of-the-art TTA methods on two standard benchmark tasks: domain generalization, namely VLCS, PACS, OfficeHome, and TerraIncognita, and image corruption, particularly CIFAR-10/100C.
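The sketch below is a minimal illustration of the nearest neighbor-based pseudo-labeling in step (2), assuming cosine similarity in the embedding space and a support set of embeddings paired with prototype-based class distributions; the function and argument names (nn_pseudo_labels, test_emb, support_emb, support_probs) and the cross-entropy matching loss are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def nn_pseudo_labels(test_emb, support_emb, support_probs, k=5):
    # Average the prototype-based class distributions of the k nearest
    # support embeddings (cosine similarity) to form pseudo-labels.
    test_emb = F.normalize(test_emb, dim=1)          # (B, D)
    support_emb = F.normalize(support_emb, dim=1)    # (N, D)
    sims = test_emb @ support_emb.t()                # (B, N) similarities
    _, idx = sims.topk(k, dim=1)                     # k nearest neighbors per test sample
    return support_probs[idx].mean(dim=1)            # (B, C) pseudo-label distribution

def self_training_loss(class_probs, pseudo_probs):
    # Cross-entropy between a module's class distribution and the
    # nearest neighbor-based pseudo-label distribution (step (3)).
    return -(pseudo_probs * class_probs.clamp_min(1e-8).log()).sum(dim=1).mean()

Under this reading, the adaptation modules are updated for a few steps to minimize self_training_loss, and step (4) averages the class distributions predicted by the modules.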


Few-Example Clustering via Contrastive Learning

arXiv.org Artificial Intelligence

We propose Few-Example Clustering (FEC), a novel clustering algorithm that performs contrastive learning to cluster few examples. FEC is based on the hypothesis that the contrastive learner with the ground-truth cluster assignment is trained faster than the others; this hypothesis builds on the phenomenon that deep neural networks initially learn patterns from the training examples. Our method is composed of the following three steps: (1) generation of candidate cluster assignments, (2) contrastive learning for each cluster assignment, and (3) selection of the best candidate.
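The loop below is a minimal sketch of the three steps under stated assumptions: candidate assignments are given as label vectors, each contrastive learner is a single linear embedding trained with a supervised contrastive-style loss, and the training loss after a fixed number of steps serves as a proxy for training speed. None of these concrete choices are taken from the paper.

import torch
import torch.nn.functional as F

def sup_con_loss(z, labels, tau=0.1):
    # Supervised contrastive-style loss: pull together embeddings that share
    # a candidate cluster label, push apart the rest.
    z = F.normalize(z, dim=1)
    sims = z @ z.t() / tau
    pos = (labels[:, None] == labels[None, :]).float() - torch.eye(len(z))
    log_p = sims - sims.logsumexp(dim=1, keepdim=True)
    return -(pos * log_p).sum() / pos.sum().clamp_min(1.0)

def fec_select(x, candidates, steps=50, lr=1e-2):
    # Step (1): candidates is a list of candidate cluster-label vectors.
    best, best_loss = None, float("inf")
    for labels in candidates:
        enc = torch.nn.Linear(x.shape[1], 16)        # small contrastive learner
        opt = torch.optim.SGD(enc.parameters(), lr=lr)
        for _ in range(steps):                       # step (2): contrastive learning
            loss = sup_con_loss(enc(x), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
        if loss.item() < best_loss:                  # step (3): fastest learner wins
            best, best_loss = labels, loss.item()
    return best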


Robust Training with Ensemble Consensus

arXiv.org Machine Learning

Since deep neural networks are over-parametrized, they may memorize noisy examples. We address this memorization issue in the presence of annotation noise. From the fact that deep neural networks cannot generalize to neighborhoods of the features acquired via memorization, we find that noisy examples do not consistently incur small losses on the network in the presence of perturbation. Based on this, we propose a novel training method called Learning with Ensemble Consensus (LEC), whose goal is to prevent overfitting to noisy examples by eliminating those identified via the consensus of an ensemble of perturbed networks. One of the proposed LECs, LTEC, outperforms the current state-of-the-art methods on MNIST, CIFAR-10, and CIFAR-100 despite its efficient memory usage. 1 INTRODUCTION Deep neural networks (DNNs) have shown excellent performance (Krizhevsky et al., 2012; He et al., 2016) on visual recognition datasets (Deng et al., 2009). However, it is difficult to obtain annotated datasets of such high quality in practice (Wang et al., 2018a). Even worse, DNNs may fail to generalize from training data in the presence of noisy examples (Zhang et al., 2016). Therefore, there is an increasing demand for robust training methods. In general, DNNs trained on noisy datasets first generalize clean examples (Arpit et al., 2017).
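As a rough sketch of the consensus idea, the function below keeps only the examples whose loss stays small for every perturbed copy of the network, assuming Gaussian weight perturbations, a per-example cross-entropy loss, and a fixed kept fraction; the paper's concrete perturbation schemes and selection rule may differ.

import copy
import torch
import torch.nn.functional as F

def consensus_keep_mask(model, x, y, n_perturb=5, sigma=1e-2, keep_ratio=0.7):
    # Keep only examples whose loss is among the smallest keep_ratio fraction
    # for every perturbed copy of the network (the ensemble consensus).
    keep = torch.ones(len(x), dtype=torch.bool)
    k = int(keep_ratio * len(x))
    with torch.no_grad():
        for _ in range(n_perturb):
            noisy = copy.deepcopy(model)
            for p in noisy.parameters():             # Gaussian weight perturbation
                p.add_(sigma * torch.randn_like(p))
            losses = F.cross_entropy(noisy(x), y, reduction="none")
            small = torch.zeros_like(keep)
            small[losses.topk(k, largest=False).indices] = True
            keep &= small
    return keep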


Fourier Phase Retrieval with Extended Support Estimation via Deep Neural Network

arXiv.org Machine Learning

To improve the reconstruction performance of $x$, we exploit an extended support estimate $E$ of size larger than $k$ satisfying $E \supseteq T$. We propose a learning method for the deep neural network to provide $E$ as a union of equivalent solutions of $T$ by utilizing modulo Fourier invariances, and suggest a search technique for $T$ that iteratively samples $E$ from the trained network output and applies hard thresholding to $E$. Numerical results show that our proposed scheme achieves superior performance with lower complexity compared to the local search-based greedy sparse phase retrieval method and a state-of-the-art variant of the Fienup method. Index Terms: deep neural network, extended support estimation, Fourier transform, sparse phase retrieval. I. INTRODUCTION Sparse phase retrieval from the magnitude of the Fourier transform (SPRF) [1], [2] has been widely studied in many fields including X-ray crystallography [3], optics [4], [5], and computational biology [6].
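A minimal sketch of the sampling-and-thresholding step, assuming the trained network outputs nonnegative per-index scores; the sampling distribution, the extra support size, and the thresholding criterion are illustrative assumptions rather than the paper's procedure.

import numpy as np

def sample_support(scores, k, extra=2, rng=None):
    # Sample an extended support E of size k + extra in proportion to the
    # network's scores, then hard-threshold E down to the k largest scores.
    rng = np.random.default_rng() if rng is None else rng
    p = scores / scores.sum()                        # assumes nonnegative scores
    E = rng.choice(len(scores), size=k + extra, replace=False, p=p)
    T_hat = E[np.argsort(scores[E])[-k:]]            # hard thresholding on E
    return np.sort(T_hat)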


Tree Search Network for Sparse Regression

arXiv.org Machine Learning

We consider the classical sparse regression problem of recovering a sparse signal $x_0$ given a measurement vector $y = \Phi x_0 + w$. We propose a tree search algorithm driven by a deep neural network for sparse regression (TSN). TSN improves the signal reconstruction performance of the deep neural network designed for sparse regression by performing a tree search with pruning. In both noiseless and noisy cases, TSN recovers synthetic and real signals with lower complexity than a conventional tree search and outperforms existing algorithms by a large margin for various types of the sensing matrix $\Phi$ widely used in sparse regression.
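A minimal sketch of a network-guided tree search with pruning, keeping a small beam of candidate supports ranked by least-squares residual; the scoring network score_fn, the branching factor, and the beam width are assumptions and not the paper's exact procedure.

import numpy as np

def residual(Phi, y, support):
    # Least-squares fit on the candidate support; the residual norm scores it.
    x, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
    return np.linalg.norm(y - Phi[:, support] @ x)

def tree_search(Phi, y, k, score_fn, branch=3, beam=5):
    # Grow candidate supports index by index, branching on the network's top
    # scores and pruning to the supports with the smallest residuals.
    beams = [[]]
    for _ in range(k):
        children = []
        for s in beams:
            scores = np.asarray(score_fn(Phi, y, s), dtype=float).copy()
            scores[s] = -np.inf                      # do not reuse chosen indices
            for j in np.argsort(scores)[-branch:]:
                children.append(s + [int(j)])
        beams = sorted(children, key=lambda c: residual(Phi, y, c))[:beam]
    return beams[0]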


Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update

arXiv.org Machine Learning

We propose Episodic Backward Update - a new algorithm to boost the performance of a deep reinforcement learning agent via fast reward propagation. In contrast to the conventional use of experience replay with uniform random sampling, our agent samples a whole episode and successively propagates the value of a state to its previous states. Our computationally efficient recursive algorithm allows sparse and delayed rewards to propagate efficiently through all transitions of a sampled episode. We evaluate our algorithm on the 2D MNIST Maze environment and 49 games of the Atari 2600 environment, and show that our method improves sample efficiency at a competitive computational cost.
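A minimal sketch of the backward sweep over one sampled episode, mixing the one-step bootstrapped target with the value propagated from later transitions via a diffusion factor; the tabular-style form and the factor beta are assumptions rather than the paper's exact DQN update.

import numpy as np

def episodic_backward_targets(rewards, next_q, dones, gamma=0.99, beta=0.5):
    # Sweep the sampled episode backward, mixing the one-step bootstrap value
    # with the target already propagated from later transitions.
    T = len(rewards)
    targets = np.zeros(T)
    propagated = 0.0
    for t in reversed(range(T)):
        bootstrap = 0.0 if dones[t] else next_q[t]
        propagated = rewards[t] + gamma * ((1.0 - beta) * bootstrap + beta * propagated)
        targets[t] = propagated
    return targets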