AVROBUSTBENCH: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time
Sarthak Kumar Maharana, Saksham Singh Kushwaha, Baoming Zhang, Adrian Rodriguez, Songtao Wei, Yapeng Tian
–Neural Information Processing Systems
AVROBUSTBENCH comprises four audio-visual benchmark datasets, AUDIOSET-2C, VGGSOUND-2C, KINETICS-2C, and EPICKITCHENS-2C, each incorporating 75 bimodal audio-visual corruptions that are co-occurring and correlated. Through extensive evaluations, we observe that state-of-the-art supervised and self-supervised audio-visual models become less robust as corruption severity increases.
Jun-16-2026, 18:24:54 GMT
- Country:
- North America > United States (0.45)
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Information Technology (0.93)
- Technology: