AITopics | testing split

Collaborating Authors

testing split

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AnoShift: ADistributionShiftBenchmarkfor UnsupervisedAnomalyDetection

Neural Information Processing SystemsFeb-12-2026, 03:17:03 GMT

Theexisting benchmarks arefocused onsupervised learning, andtothe best of our knowledge, there is none for unsupervised learning.

data mining, distribution shift, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.07)
Europe > Romania (0.04)

Industry: Information Technology > Security & Privacy (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.96)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.33)

Add feedback

VISAT: Benchmarking Adversarial and Distribution Shift Robustness in Traffic Sign Recognition with Visual Attributes

Yu, Simon, Yu, Peilin, Zheng, Hongbo, Shao, Huajie, Zhao, Han, Sha, Lui

arXiv.org Artificial IntelligenceNov-3-2025

We present VISAT, a novel open dataset and benchmarking suite for evaluating model robustness in the task of traffic sign recognition with the presence of visual attributes. Built upon the Mapillary Traffic Sign Dataset (MTSD), our dataset introduces two benchmarks that respectively emphasize robustness against adversarial attacks and distribution shifts. For our adversarial attack benchmark, we employ the state-of-the-art Projected Gradient Descent (PGD) method to generate adversarial inputs and evaluate their impact on popular models. Additionally, we investigate the effect of adversarial attacks on attribute-specific multi-task learning (MTL) networks, revealing spurious correlations among MTL tasks. The MTL networks leverage visual attributes (color, shape, symbol, and text) that we have created for each traffic sign in our dataset. For our distribution shift benchmark, we utilize ImageNet-C's realistic data corruption and natural variation techniques to perform evaluations on the robustness of both base and MTL models. Moreover, we further explore spurious correlations among MTL tasks through synthetic alterations of traffic sign colors using color quantization techniques. Our experiments focus on two major backbones, ResNet-152 and ViT-B/32, and compare the performance between base and MTL models. The VISAT dataset and benchmarking framework contribute to the understanding of model robustness for traffic sign recognition, shedding light on the challenges posed by adversarial attacks and distribution shifts. We believe this work will facilitate advancements in developing more robust models for real-world applications in autonomous driving and cyber-physical systems.

artificial intelligence, individual imagenet-c data corruption attack, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2510.26833

Country: Europe (0.67)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Transportation > Ground > Road (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

AnoShift: A Distribution Shift Benchmark for Unsupervised Anomaly Detection

Neural Information Processing SystemsAug-19-2025, 05:28:36 GMT

Next, we propose AnoShift, a protocol splitting the data in IID, NEAR, and FAR testing splits.

data mining, distribution shift, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.07)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Appendix: Lifelong Domain Adaptation via Consolidated Internal Distribution

Neural Information Processing SystemsAug-14-2025, 17:29:35 GMT

In Figure 1, we see the high-level description of the lifelong UDA approach. Lifelong learning is an iterative process in which the model is updated persistently. Upon training on a source domain with labeled data, the input data is transformed into a multi-modal distribution in the embedding space.

consolidated internal distribution, lifelong domain adaptation, source domain, (11 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Los Angeles (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models

Chiu, Hsu-kuang, Hachiuma, Ryo, Wang, Chien-Yi, Smith, Stephen F., Wang, Yu-Chiang Frank, Chen, Min-Hung

arXiv.org Artificial IntelligenceFeb-17-2025

Current autonomous driving vehicles rely mainly on their individual sensors to understand surrounding scenes and plan for future trajectories, which can be unreliable when the sensors are malfunctioning or occluded. To address this problem, cooperative perception methods via vehicle-to-vehicle (V2V) communication have been proposed, but they have tended to focus on detection and tracking. How those approaches contribute to overall cooperative planning performance is still under-explored. Inspired by recent progress using Large Language Models (LLMs) to build autonomous driving systems, we propose a novel problem setting that integrates an LLM into cooperative autonomous driving, with the proposed Vehicle-to-Vehicle Question-Answering (V2V-QA) dataset and benchmark. We also propose our baseline method Vehicle-to-Vehicle Large Language Model (V2V-LLM), which uses an LLM to fuse perception information from multiple connected autonomous vehicles (CAVs) and answer driving-related questions: grounding, notable object identification, and planning. Experimental results show that our proposed V2V-LLM can be a promising unified model architecture for performing various tasks in cooperative autonomous driving, and outperforms other baseline methods that use different fusion approaches. Our work also creates a new research direction that can improve the safety of future autonomous driving systems. Our project website: https://eddyhkchiu.github.io/v2vllm.github.io/ .

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.0998

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

VEMOCLAP: A video emotion classification web application

Sulun, Serkan, Viana, Paula, Davies, Matthew E. P.

arXiv.org Artificial IntelligenceOct-22-2024

We introduce VEMOCLAP: Video EMOtion Classifier using Pretrained features, the first readily available and open-source web application that analyzes the emotional content of any user-provided video. We improve our previous work, which exploits open-source pretrained models that work on video frames and audio, and then efficiently fuse the resulting pretrained features using multi-head cross-attention. Our approach increases the state-of-the-art classification accuracy on the Ekman-6 video emotion dataset by 4.3% and offers an online application for users to run our model on their own videos or YouTube videos. We invite the readers to try our application at serkansulun.com/app.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2410.21303

Country: Europe > Portugal > Porto > Porto (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

Add feedback

Clinical Language Understanding Evaluation (CLUE)

Goodwin, Travis R., Demner-Fushman, Dina

arXiv.org Artificial IntelligenceSep-28-2022

Clinical language processing has received a lot of attention in recent years, resulting in new models or methods for disease phenotyping, mortality prediction, and other tasks. Unfortunately, many of these approaches are tested under different experimental settings (e.g., data sources, training and testing splits, metrics, evaluation criteria, etc.) making it difficult to compare approaches and determine state-of-the-art. To address these issues and facilitate reproducibility and comparison, we present the Clinical Language Understanding Evaluation (CLUE) benchmark with a set of four clinical language understanding tasks, standard training, development, validation and testing sets derived from MIMIC data, as well as a software toolkit. It is our hope that these data will enable direct comparison between approaches, improve reproducibility, and reduce the barrier-to-entry for developing novel models or methods for these clinical language understanding tasks.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2209.14377

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > North Macedonia > Skopje Statistical Region > Skopje Municipality > Skopje (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Health Care Providers & Services (0.96)
Health & Medicine > Therapeutic Area > Nephrology (0.96)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Learning to Split for Automatic Bias Detection

Bao, Yujia, Barzilay, Regina

arXiv.org Artificial IntelligenceJul-20-2022

Classifiers are biased when trained on biased datasets. As a remedy, we propose Learning to Split (ls), an algorithm for automatic bias detection. Given a dataset with input-label pairs, ls learns to split this dataset so that predictors trained on the training split cannot generalize to the testing split. This performance gap suggests that the testing split is under-represented in the dataset, which is a signal of potential bias. Identifying non-generalizable splits is challenging since we have no annotations about the bias. In this work, we show that the prediction correctness of each example in the testing split can be used as a source of weak supervision: generalization performance will drop if we move examples that are predicted correctly away from the testing split, leaving only those that are mis-predicted. ls is task-agnostic and can be applied to any supervised learning problem, ranging from natural language understanding and image classification to molecular property prediction. Empirical results show that ls is able to generate astonishingly challenging splits that correlate with human-identified biases. Moreover, we demonstrate that combining robust learning algorithms (such as group DRO) with splits identified by ls enables automatic de-biasing. Compared to previous state-of-the-art, we substantially improve the worst-group performance (23.4% on average) when the source of biases is unknown during training and validation.

predictor, splitter, testing split, (16 more...)

arXiv.org Artificial Intelligence

2204.13749

Country: