AITopics | out-of-distribution

Collaborating Authors

out-of-distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

b8d1d741f137d9b6ac4f3c1683791e4a-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 13:56:04 GMT

different model, different number, out-of-distribution, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Towards Open-World Human Action Segmentation Using Graph Convolutional Networks

Xing, Hao, Boey, Kai Zhe, Cheng, Gordon

arXiv.org Artificial IntelligenceDec-12-2025

Human-object interaction segmentation is a fundamental task of daily activity understanding, which plays a crucial role in applications such as assistive robotics, healthcare, and autonomous systems. Most existing learning-based methods excel in closed-world action segmentation, they struggle to generalize to open-world scenarios where novel actions emerge. Collecting exhaustive action categories for training is impractical due to the dynamic diversity of human activities, necessitating models that detect and segment out-of-distribution actions without manual annotation. To address this issue, we formally define the open-world action segmentation problem and propose a structured framework for detecting and segmenting unseen actions. Our framework introduces three key innovations: 1) an Enhanced Pyramid Graph Convolutional Network (EPGCN) with a novel decoder module for robust spatiotemporal feature upsampling. 2) Mixup-based training to synthesize out-of-distribution data, eliminating reliance on manual annotations. 3) A novel Temporal Clustering loss that groups in-distribution actions while distancing out-of-distribution samples. We evaluate our framework on two challenging human-object interaction recognition datasets: Bimanual Actions and 2 Hands and Object (H2O) datasets. Experimental results demonstrate significant improvements over state-of-the-art action segmentation models across multiple open-set evaluation metrics, achieving 16.9% and 34.6% relative gains in open-set segmentation (F1@50) and out-of-distribution detection performances (AUROC), respectively. Additionally, we conduct an in-depth ablation study to assess the impact of each proposed component, identifying the optimal framework configuration for open-world action segmentation.

artificial intelligence, machine learning, recognition, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IROS60139.2025.11247257

2507.00756

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Bremen > Bremen (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Flow-Induced Diagonal Gaussian Processes

Lin, Moule, Patane, Andrea, Jing, Weipeng, Guan, Shuhao, Botterweck, Goetz

arXiv.org Artificial IntelligenceOct-6-2025

We present Flow-Induced Diagonal Gaussian Processes (FiD-GP), a compression framework that incorporates a compact inducing weight matrix to project a neural network's weight uncertainty into a lower-dimensional subspace. Critically, FiD-GP relies on normalising-flow priors and spectral regularisations to augment its expressiveness and align the inducing subspace with feature-gradient geometry through a numerically stable projection mechanism objective. Furthermore, we demonstrate how the prediction framework in FiD-GP can help to design a single-pass projection for Out-of-Distribution (OoD) detection. Our analysis shows that FiD-GP improves uncertainty estimation ability on various tasks compared with SVGP-based baselines, satisfies tight spectral residual bounds with theoretically guaranteed OoD detection, and significantly compresses the neural network's storage requirements at the cost of increased inference computation dependent on the number of inducing weights employed. Specifically, in a comprehensive empirical study spanning regression, image classification, semantic segmentation, and out-of-distribution detection benchmarks, it cuts Bayesian training cost by several orders of magnitude, compresses parameters by roughly 51%, reduces model size by about 75%, and matches state-of-the-art accuracy and uncertainty estimation.

data mining, machine learning, neural information processing system, (18 more...)

arXiv.org Artificial Intelligence

2509.17153

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
North America > Canada > Ontario > Toronto (0.14)
South America > Peru (0.04)
Asia > China > Heilongjiang Province (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Robust Taxi Fare Prediction Under Noisy Conditions: A Comparative Study of GAT, TimesNet, and XGBoost

Moorthy, Padmavathi

arXiv.org Artificial IntelligenceJul-29-2025

--Precise fare prediction is crucial in ride-hailing platforms and urban mobility systems. This study examines three machine learning models--Graph Attention Networks (GA T), XGBoost, and TimesNet--to evaluate their predictive capabilities for taxi fares using a real-world dataset comprising over 55 million records. Both raw (noisy) and denoised versions of the dataset are analyzed to assess the impact of data quality on model performance. The study evaluated the models along multiple axes, including predictive accuracy, calibration, uncertainty estimation, out-of-distribution (OOD) robustness, and feature sensitivity. We also explore pre-processing strategies, including KNN imputation, Gaussian noise injection, and autoencoder-based denoising. The study reveals critical differences between classical and deep learning models under realistic conditions, offering practical guidelines for building robust and scalable models in urban fare prediction systems. Index T erms--T axi Fare Prediction, Machine Learning, Graph Attention Network, XGBoost, Time Series, Uncertainty Estimation, Ensemble Models, Kolmogorov-Smirnov (KS), Out-of-Distribution (OOD). A. Background and Motivation Accurately estimating taxi fares plays a pivotal role in intelligent transportation systems and urban mobility planning.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.20008

Country: Africa > Rwanda > Kigali > Kigali (0.04)

Genre: Research Report > Experimental Study (0.34)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Add feedback

Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors

Huang, Jing, Tao, Junyi, Icard, Thomas, Yang, Diyi, Potts, Christopher

arXiv.org Machine LearningMay-20-2025

Interpretability research now offers a variety of techniques for identifying abstract internal mechanisms in neural networks. Can such techniques be used to predict how models will behave on out-of-distribution examples? In this work, we provide a positive answer to this question. Through a diverse set of language modeling tasks--including symbol manipulation, knowledge retrieval, and instruction following--we show that the most robust features for correctness prediction are those that play a distinctive causal role in the model's behavior. Specifically, we propose two methods that leverage causal mechanisms to predict the correctness of model outputs: counterfactual simulation (checking whether key causal variables are realized) and value probing (using the values of those variables to make predictions). Both achieve high AUC-ROC in distribution and outperform methods that rely on causal-agnostic features in out-of-distribution settings, where predicting model behaviors is more crucial. Our work thus highlights a novel and significant application for internal causal analysis of language models.

large language model, machine learning, mechanism, (20 more...)

arXiv.org Machine Learning

2505.1177

Country:

Europe > Austria > Vienna (0.14)
South America > Suriname (0.05)
Asia > Japan (0.04)
(16 more...)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Attend or Perish: Benchmarking Attention in Algorithmic Reasoning

Spiegel, Michal, Štefánik, Michal, Kadlčík, Marek, Kuchař, Josef

arXiv.org Artificial IntelligenceFeb-28-2025

Can transformers learn to perform algorithmic tasks reliably across previously unseen input/output domains? While pre-trained language models show solid accuracy on benchmarks incorporating algorithmic reasoning, assessing the reliability of these results necessitates an ability to cleanse models' functional capabilities from memorization. In this paper, we propose an algorithmic benchmark comprising six tasks of infinite input domains where we can also disentangle and trace the correct, robust algorithm necessary for the task. This allows us to assess (i) models' ability to extrapolate to unseen types of inputs, including new lengths, value ranges or input domains, but also (ii) to assess the robustness of the functional mechanism in recent models through the lens of their attention maps. We make the implementation of all our tasks and interoperability methods publicly available at https://github.com/michalspiegel/AttentionSpan .

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.01909

Country:

Europe > Middle East > Malta > Eastern Region > Northern Harbour District > St. Julian's (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Out-of-distribution forgetting: vulnerability of continual learning to intra-class distribution shift

Guo, Liangxuan, Chen, Yang, Yu, Shan

arXiv.org Artificial IntelligenceJun-1-2023

Continual learning (CL) is an important technique to allow artificial neural networks to work in open environments. CL enables a system to learn new tasks without severe interference to its performance on old tasks, i.e., overcome the problems of catastrophic forgetting. In joint learning, it is well known that the out-of-distribution (OOD) problem caused by intentional attacks or environmental perturbations will severely impair the ability of networks to generalize. In this work, we reported a special form of catastrophic forgetting raised by the OOD problem in continual learning settings, and we named it out-of-distribution forgetting (OODF). In continual image classification tasks, we found that for a given category, introducing an intra-class distribution shift significantly impaired the recognition accuracy of CL methods for that category during subsequent learning. Interestingly, this phenomenon is special for CL as the same level of distribution shift had only negligible effects in the joint learning scenario. We verified that CL methods without dedicating subnetworks for individual tasks are all vulnerable to OODF. Moreover, OODF does not depend on any specific way of shifting the distribution, suggesting it is a risk for CL in a wide range of circumstances. Taken together, our work identified an under-attended risk during CL, highlighting the importance of developing approaches that can overcome OODF.

artificial intelligence, distribution shift, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2306.00427

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre:

Research Report > Experimental Study (0.47)
Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

DOODLER: Determining Out-Of-Distribution Likelihood from Encoder Reconstructions

Kent, Jonathan S., Li, Bo

arXiv.org Machine LearningSep-27-2021

Deep Learning models possess two key traits that, in combination, make their use in the real world a risky prospect. One, they do not typically generalize well outside of the distribution for which they were trained, and two, they tend to exhibit confident behavior regardless of whether or not they are producing meaningful outputs. While Deep Learning possesses immense power to solve realistic, high-dimensional problems, these traits in concert make it difficult to have confidence in their real-world applications. To overcome this difficulty, the task of Out-Of-Distribution (OOD) Detection has been defined, to determine when a model has received an input from outside of the distribution for which it is trained to operate. This paper introduces and examines a novel methodology, DOODLER, for OOD Detection, which directly leverages the traits which result in its necessity. By training a Variational Auto-Encoder (VAE) on the same data as another Deep Learning model, the VAE learns to accurately reconstruct In-Distribution (ID) inputs, but not to reconstruct OOD inputs, meaning that its failure state can be used to perform OOD Detection. Unlike other work in the area, DOODLER requires only very weak assumptions about the existence of an OOD dataset, allowing for more realistic application. DOODLER also enables pixel-wise segmentations of input images by OOD likelihood, and experimental results show that it matches or outperforms methodologies that operate under the same constraints.

artificial intelligence, machine learning, reconstruction error, (16 more...)

arXiv.org Machine Learning

2109.13237

Country:

North America > United States > Illinois (0.04)
Europe > Poland (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks

Lee, Kimin, Lee, Kibok, Lee, Honglak, Shin, Jinwoo

arXiv.org Machine LearningJul-10-2018

Detecting test samples drawn sufficiently far away from the training distribution statistically or adversarially is a fundamental requirement to deploying a good classifier in many real-world machine learning applications. However, deep neural networks with the softmax classifier are known to produce highly overconfident posterior distributions even for such abnormal samples. In this paper, we propose a simple yet effective method for detecting any abnormal samples, which is applicable to any pre-trained softmax neural classifier. We obtain the class conditionalGaussian distributions with respect to (low- and upper-level) features of the deep models under Gaussian discriminant analysis, which result in a confidence score based on the Mahalanobis distance. While most prior methods have been evaluated for detecting either out-of-distribution or adversarial samples, but not both, the proposed method achieves the state-of-art performances for both cases in our experiments. Moreover, we found that our proposed method is more robust in extreme cases, e.g., when the training dataset has noisy labels or small number of samples. Finally, we show that the proposed method enjoys broader usage by applying it to class incremental learning: whenever out-of-distribution samples are detected, our classification rule can incorporate new classes well without further training deep models.

artificial intelligence, classifier, machine learning, (20 more...)

arXiv.org Machine Learning

1807.03888

Country:

North America > United States > Michigan (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback