COME: Test-time adaption by Conservatively Minimizing Entropy

Zhang, Qingyang, Bian, Yatao, Kong, Xinke, Zhao, Peilin, Zhang, Changqing

arXiv.org Machine Learning 

As the predominant principle, entropy minimization (EM) has been proven to be a simple yet effective cornerstone in existing test-time adaption (TTA) methods. Unfortunately, its fatal limitation, overconfidence, tends to result in model collapse. To address this issue, we propose to Conservatively Minimize the Entropy (COME), a simple drop-in replacement for traditional EM that elegantly addresses this limitation. By doing so, COME naturally regularizes the model to favor conservative confidence on unreliable samples. Theoretically, we provide a preliminary analysis showing that COME enhances optimization stability by introducing a data-adaptive lower bound on the entropy. Empirically, our method achieves state-of-the-art performance on commonly used benchmarks, showing significant improvements in classification accuracy and uncertainty estimation under various settings including standard, life-long, and open-world TTA, i.e., up to 34.5% improvement in accuracy and 15.1% in false positive rate.

Endowing machine learning models with the ability to self-adjust is essential for their deployment in the open world, for example in autonomous vehicle control and embodied AI systems. To this end, test-time adaption (TTA) emerges as a promising strategy to enhance performance in open-world settings, which often involve unexpected noise or corruption (e.g., data from rainy or snowy weather). Unsupervised losses play a crucial role in model adaptation: they can improve a model's accuracy on test data from novel distributions without the need for additional labeled training data. The initial intuition behind using entropy minimization, given by (Wang et al., 2021), is based on the observation that models tend to be more accurate on samples for which they make predictions with higher confidence. The natural extension of this observation is to encourage models to bolster their confidence on test samples.
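The EM objective referred to above can be sketched as follows. This is a minimal NumPy illustration of the standard Shannon-entropy loss over model predictions, not the paper's implementation; the function names are ours:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy_minimization_loss(logits):
    """Mean Shannon entropy of the predictive distributions.

    Minimizing this quantity over a batch of unlabeled test samples
    is the EM objective: it pushes predictions toward high confidence.
    """
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())

# Confident predictions have low entropy; uniform predictions attain
# the maximum, log(num_classes).
confident = np.array([[10.0, 0.0, 0.0]])
uniform = np.array([[1.0, 1.0, 1.0]])
assert entropy_minimization_loss(confident) < entropy_minimization_loss(uniform)
```

Minimizing this loss on every test sample, reliable or not, is what drives the overconfidence the paper identifies; COME's conservative variant instead bounds how low the entropy may be pushed on unreliable samples.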