AVROBUSTBENCH: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time

Sarthak Kumar Maharana, Saksham Singh Kushwaha, Baoming Zhang, Adrian Rodriguez, Songtao Wei, Yapeng Tian

Jun-16-2026, 18:24:54 GMT–Neural Information Processing Systems

AVROBUSTBENCH comprises four audio-visual benchmark datasets, AUDIOSET-2C, VGGSOUND-2C, KINETICS-2C, and EPICKITCHENS-2C, each incorporating 75 bimodal audio-visual corruptions that are co-occurring and correlated. Through extensive evaluations, we observe that state-of-the-art supervised and self-supervised audio-visual models become less robust as corruption severity increases.

large language model, machine learning, natural language

Country:
- North America > United States (0.45)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language > Large Language Model (0.93)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
