AITopics | tta

Frustratingly Easy Test-Time Adaptation of Vision-Language Models

Neural Information Processing Systems | Mar-22-2026, 19:09:03 GMT

Vision-Language Models seamlessly discriminate among arbitrary semantic categories, yet they still suffer from poor generalization when presented with challenging examples. For this reason, Episodic Test-Time Adaptation (TTA) strategies have recently emerged as powerful techniques to adapt VLMs in the presence of a single unlabeled image. The recent literature on TTA is dominated by the paradigm of prompt tuning by Marginal Entropy Minimization, which, relying on online backpropagation, inevitably slows down inference while increasing memory. In this work, we theoretically investigate the properties of this approach and unveil that a surprisingly strong TTA method lies dormant and hidden within it. We term this approach ZERO (TTA with "zero" temperature), whose design is both incredibly effective and frustratingly simple: augment N times, predict, retain the most confident predictions, and marginalize after setting the Softmax temperature to zero. Remarkably, ZERO requires a single batched forward pass through the vision encoder only and no backward passes. We thoroughly evaluate our approach following the experimental protocol established in the literature and show that ZERO largely surpasses or compares favorably w.r.t. the state-of-the-art while being almost 10× faster and 13× more memory friendly than standard Test-Time Prompt Tuning. Thanks to its simplicity and comparatively negligible computation, ZERO can serve as a strong baseline for future work in this field.
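
The four steps named in the abstract above (augment, predict, retain the most confident predictions, marginalize at zero temperature) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `keep_fraction` parameter, and the use of the largest class probability as the confidence score are assumptions made for illustration, and the per-view logits are taken as given rather than computed by a real vision-language model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def zero_predict(view_logits, keep_fraction=0.5):
    """Hypothetical sketch of the ZERO recipe.

    view_logits: N lists of C logits, one list per augmented view.
    keep_fraction: share of views to retain (assumed hyperparameter).
    Returns (predicted_class, marginal_distribution).
    """
    probs = [softmax(row) for row in view_logits]
    # Assumption: a view's confidence is its largest class probability.
    ranked = sorted(probs, key=max, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    kept = ranked[:n_keep]
    # As the temperature goes to zero, softmax becomes one-hot at the argmax,
    # so averaging the retained views reduces to counting their votes.
    votes = [0] * len(view_logits[0])
    for p in kept:
        votes[p.index(max(p))] += 1
    marginal = [count / n_keep for count in votes]
    return marginal.index(max(marginal)), marginal


# Four augmented views, three classes; the least confident view is dropped.
views = [[5.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 3.5, 0.0], [0.1, 0.0, 0.2]]
prediction, marginal = zero_predict(views, keep_fraction=0.75)
```

Because only forward-pass outputs are used, nothing in this sketch requires gradients, which is consistent with the abstract's claim that no backward passes are needed.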

artificial intelligence, machine learning, natural language, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.76)
Information Technology > Artificial Intelligence > Natural Language (0.62)

Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line

Neural Information Processing Systems | Mar-22-2026, 16:05:58 GMT

Recently, Miller et al. (2021) and Baek et al. (2022) empirically demonstrated strong linear correlations between in-distribution (ID) versus out-of-distribution (OOD) accuracy and agreement. These trends, coined accuracy-on-the-line (ACL) and agreement-on-the-line (AGL), enable OOD model selection and performance estimation without labeled data. However, these phenomena also break for certain shifts, such as CIFAR10-C Gaussian Noise, posing a critical bottleneck. In this paper, we make a key finding that recent test-time adaptation (TTA) methods not only improve OOD performance, but also drastically strengthen the ACL and AGL trends in models, even in shifts where models showed very weak correlations before. To analyze this, we revisit the theoretical conditions from Miller et al. (2021) that outline the types of distribution shifts needed for perfect ACL in linear models. Surprisingly, these conditions are satisfied after applying TTA to deep models in the penultimate feature embedding space. In particular, TTA causes the data distribution to collapse complex shifts into those that can be expressed by a singular scaling variable in the feature space. Our results show that by combining TTA with AGL-based estimation methods, we can estimate the OOD performance of models with high precision for a broader set of distribution shifts. This lends us a simple system for selecting the best hyperparameters and adaptation strategy without any OOD labeled data.
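
The label-free estimation idea described in the abstract above can be sketched as follows: when the ID-versus-OOD agreement line and the ID-versus-OOD accuracy line coincide, a line fitted on agreement (which needs no labels) can be applied to ID accuracy to predict OOD accuracy. This is a simplified illustration, not the paper's estimator: the helper names are hypothetical, the fit is ordinary least squares on raw rates, and any axis rescaling used in the cited works is omitted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_ood_accuracy(id_agreement, ood_agreement, id_accuracy):
    """Hypothetical AGL-style estimate of OOD accuracy.

    id_agreement / ood_agreement: agreement rates of model pairs, measured on
    unlabeled ID and OOD data (same pair order in both lists).
    id_accuracy: per-model accuracy on labeled ID data.
    Assumes the agreement line and the accuracy line share slope and intercept.
    """
    slope, intercept = fit_line(id_agreement, ood_agreement)
    return [slope * acc + intercept for acc in id_accuracy]


# Toy numbers chosen for illustration only.
estimates = estimate_ood_accuracy(
    id_agreement=[0.9, 0.8, 0.7],
    ood_agreement=[0.7, 0.5, 0.3],
    id_accuracy=[0.95, 0.85],
)
```

The estimate is only as good as the linear trend itself, which is why the abstract's finding that TTA strengthens these trends matters for this kind of procedure.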

artificial intelligence, machine learning, proceedings, (5 more...)

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

d96fcc07d623a9eba68616629911143a-Paper-Conference.pdf

Neural Information Processing Systems | Feb-18-2026, 08:25:52 GMT

agreement, artificial intelligence, machine learning, (17 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Communications (0.92)

DHAuDS: A Dynamic and Heterogeneous Audio Benchmark for Test-Time Adaptation

Shao, Weichuang, Liao, Iman Yi, Maul, Tomas Henrique Bode, Chandesa, Tissa

arXiv.org Artificial Intelligence | Nov-25-2025

Audio classifiers frequently face domain shift, where models trained on one dataset lose accuracy on data recorded in acoustically different conditions. Previous Test-Time Adaptation (TTA) research in speech and sound analysis often evaluates models under fixed or mismatched noise settings that fail to mimic real-world variability. To overcome these limitations, this paper presents DHAuDS (Dynamic and Heterogeneous Audio Domain Shift), a benchmark designed to assess TTA approaches under more realistic and diverse acoustic shifts. DHAuDS comprises four standardized benchmarks: UrbanSound8K-C, SpeechCommandsV2-C, VocalSound-C, and ReefSet-C, each constructed with dynamic corruption severity levels and heterogeneous noise types to simulate authentic audio degradation scenarios. The framework defines 14 evaluation criteria for each benchmark (8 for UrbanSound8K-C), resulting in 50 distinct criteria (124 experiments) that collectively enable fair, reproducible, and cross-domain comparison of TTA algorithms. Through the inclusion of dynamic and mixed-domain noise settings, DHAuDS offers a consistent and publicly reproducible testbed to support ongoing studies in robust and adaptive audio modeling.

artificial intelligence, machine learning, natural language, (18 more...)

2511.18421

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.47)

LoReTTA: A Low Resource Framework To Poison Continuous Time Dynamic Graphs

Pal, Himanshu, Bachina, Venkata Sai Pranav, Gangwal, Ankit, Sharma, Charu

arXiv.org Artificial Intelligence | Nov-11-2025

Temporal Graph Neural Networks (TGNNs) are increasingly used in high-stakes domains, such as financial forecasting, recommendation systems, and fraud detection. However, their susceptibility to poisoning attacks poses a critical security risk. We introduce LoReTTA (Low Resource Two-phase Temporal Attack), a novel adversarial framework on Continuous-Time Dynamic Graphs, which degrades TGNN performance by an average of 29.47% across 4 widely used benchmark datasets and 4 State-of-the-Art (SotA) models. LoReTTA operates through a two-stage approach: (1) sparsify the graph by removing high-impact edges using any of the 16 tested temporal importance metrics, (2) strategically replace removed edges with adversarial negatives via LoReTTA's novel degree-preserving negative sampling algorithm. Our plug-and-play design eliminates the need for expensive surrogate models while adhering to realistic unnoticeability constraints. LoReTTA degrades performance by up to 42.0% on MOOC, 31.5% on Wikipedia, 28.8% on UCI, and 15.6% on Enron. LoReTTA outperforms 11 attack baselines, remains undetectable to 4 leading anomaly detection systems, and is robust to 4 SotA adversarial defense training methods, establishing its effectiveness, unnoticeability, and robustness.

data mining, lorett, machine learning, (21 more...)

2511.07379

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Education > Educational Setting > Online (0.88)
Education > Educational Technology > Educational Software > Computer Based Training (0.49)

Technology:

Information Technology > Information Management (0.96)
Information Technology > Communications > Social Media (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
(3 more...)

de0e668df3fe63ec89e5a7e68f3d350f-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 18:55:41 GMT

adaptation, cola, knowledge, (14 more...)

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(6 more...)

Sample-Aware Test-Time Adaptation for Medical Image-to-Image Translation

Iele, Irene, Di Feola, Francesco, Guarrasi, Valerio, Soda, Paolo

arXiv.org Artificial Intelligence | Sep-17-2025

Image-to-image translation has emerged as a powerful technique in medical imaging, enabling tasks such as image denoising and cross-modality conversion. However, it suffers from limitations in handling out-of-distribution samples without causing performance degradation. To address this limitation, we propose a novel Test-Time Adaptation (TTA) framework that dynamically adjusts the translation process based on the characteristics of each test sample. Our method introduces a Reconstruction Module to quantify the domain shift and a Dynamic Adaptation Block that selectively modifies the internal features of a pretrained translation model to mitigate the shift without compromising the performance on in-distribution samples that do not require adaptation. We evaluate our approach on two medical image-to-image translation tasks: low-dose CT denoising and T1 to T2 MRI translation, showing consistent improvements over both the baseline translation model without TTA and prior TTA methods. Our analysis highlights the limitations of state-of-the-art methods that uniformly apply the adaptation to both out-of-distribution and in-distribution samples, demonstrating that dynamic, sample-specific adjustment offers a promising path to improve model resilience in real-world scenarios. The code is available at: https://github.com/Sample-Aware-TTA/Code.

artificial intelligence, configuration, machine learning, (19 more...)

2508.00766

Country: Europe (0.67)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Frustratingly Easy Test-Time Adaptation of Vision-Language Models

Neural Information Processing Systems | May-27-2025, 20:19:17 GMT

Vision-Language Models seamlessly discriminate among arbitrary semantic categories, yet they still suffer from poor generalization when presented with challenging examples. For this reason, Episodic Test-Time Adaptation (TTA) strategies have recently emerged as powerful techniques to adapt VLMs in the presence of a single unlabeled image. The recent literature on TTA is dominated by the paradigm of prompt tuning by Marginal Entropy Minimization, which, relying on online backpropagation, inevitably slows down inference while increasing memory. In this work, we theoretically investigate the properties of this approach and unveil that a surprisingly strong TTA method lies dormant and hidden within it. We term this approach ZERO (TTA with "zero" temperature), whose design is both incredibly effective and frustratingly simple: augment N times, predict, retain the most confident predictions, and marginalize after setting the Softmax temperature to zero.

artificial intelligence, machine learning, natural language, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.97)
Information Technology > Artificial Intelligence > Machine Learning (0.79)
Information Technology > Artificial Intelligence > Vision (0.64)

Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line

Neural Information Processing Systems | May-27-2025, 18:58:12 GMT

Recently, Miller et al. (2021) and Baek et al. (2022) empirically demonstrated strong linear correlations between in-distribution (ID) versus out-of-distribution (OOD) accuracy and agreement. These trends, coined accuracy-on-the-line (ACL) and agreement-on-the-line (AGL), enable OOD model selection and performance estimation without labeled data. However, these phenomena also break for certain shifts, such as CIFAR10-C Gaussian Noise, posing a critical bottleneck. In this paper, we make a key finding that recent test-time adaptation (TTA) methods not only improve OOD performance, but also drastically strengthen the ACL and AGL trends in models, even in shifts where models showed very weak correlations before. To analyze this, we revisit the theoretical conditions from Miller et al. (2021) that outline the types of distribution shifts needed for perfect ACL in linear models.

distribution shift, stronger accuracy and agreement-on-the-line, test-time adaptation induce stronger accuracy, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Higher-Order Asymptotics of Test-Time Adaptation for Batch Normalization Statistics

Kimura, Masanari

arXiv.org Machine Learning | May-23-2025

This study develops a higher-order asymptotic framework for test-time adaptation (TTA) of Batch Normalization (BN) statistics under distribution shift by integrating classical Edgeworth expansion and saddlepoint approximation techniques with a novel one-step M-estimation perspective. By analyzing the statistical discrepancy between training and test distributions, we derive an Edgeworth expansion for the normalized difference in BN means and obtain an optimal weighting parameter that minimizes the mean-squared error of the adapted statistic. Reinterpreting BN TTA as a one-step M-estimator allows us to derive higher-order local asymptotic normality results, which incorporate skewness and other higher moments into the estimator's behavior. Moreover, we quantify the trade-offs among bias, variance, and skewness in the adaptation process and establish a corresponding generalization bound on the model risk. The refined saddlepoint approximations further deliver uniformly accurate density and tail probability estimates for the BN TTA statistic. These theoretical insights provide a comprehensive understanding of how higher-order corrections and robust one-step updating can enhance the reliability and performance of BN layers in adapting to changing data distributions.
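
To make the "optimal weighting parameter that minimizes the mean-squared error" concrete, the sketch below works through the textbook first-order case only; it does not reproduce the paper's higher-order result. The assumptions are: the adapted mean is a convex combination of the stored training mean and the test-batch mean, the batch mean is unbiased with variance sigma^2 / n, and the true shift is known (an oracle quantity that is unavailable in practice). Under those assumptions the MSE is (1 - w)^2 * shift^2 + w^2 * sigma^2 / n, which is minimized at w = shift^2 / (shift^2 + sigma^2 / n). All function and parameter names are hypothetical.

```python
def optimal_weight(source_mean, target_mean, target_var, batch_size):
    """First-order MSE-optimal weight on the test-batch mean (oracle sketch).

    Balances the squared bias from keeping the training statistic against
    the sampling variance of the batch mean.
    """
    shift_sq = (target_mean - source_mean) ** 2
    noise = target_var / batch_size
    if shift_sq + noise == 0:
        return 0.0  # no shift and no noise: the training statistic is exact
    return shift_sq / (shift_sq + noise)


def adapt_mean(source_mean, batch, weight):
    """Interpolate the stored BN mean toward the mean of the test batch."""
    batch_mean = sum(batch) / len(batch)
    return (1 - weight) * source_mean + weight * batch_mean


# Toy numbers: unit shift, variance 4, batch of 4 gives equal weighting.
weight = optimal_weight(source_mean=0.0, target_mean=1.0, target_var=4.0, batch_size=4)
adapted = adapt_mean(source_mean=0.0, batch=[1.0, 3.0], weight=weight)
```

Larger batches or larger shifts push the weight toward the test statistic, while small noisy batches keep it close to the training statistic.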

artificial intelligence, machine learning, statistics, (18 more...)

2505.16257

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
