AITopics

2510.15548

Country: Asia > Middle East > UAE (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

arXiv.org Machine LearningOct-20-2025

Information Theory in Open-world Machine Learning Foundations, Frameworks, and Future Direction

Wang, Lin

Open world Machine Learning (OWML) aims to develop intelligent systems capable of recognizing known categories, rejecting unknown samples, and continually learning from novel information. Despite significant progress in open set recognition, novelty detection, and continual learning, the field still lacks a unified theoretical foundation that can quantify uncertainty, characterize information transfer, and explain learning adaptability in dynamic, nonstationary environments. This paper presents a comprehensive review of information theoretic approaches in open world machine learning, emphasizing how core concepts such as entropy, mutual information, and Kullback Leibler divergence provide a mathematical language for describing knowledge acquisition, uncertainty suppression, and risk control under open world conditions. We synthesize recent studies into three major research axes: information theoretic open set recognition enabling safe rejection of unknowns, information driven novelty discovery guiding new concept formation, and information retentive continual learning ensuring stable long term adaptation. Furthermore, we discuss theoretical connections between information theory and provable learning frameworks, including PAC Bayes bounds, open-space risk theory, and causal information flow, to establish a pathway toward provable and trustworthy open world intelligence. Finally, the review identifies key open problems and future research directions, such as the quantification of information risk, development of dynamic mutual information bounds, multimodal information fusion, and integration of information theory with causal reasoning and world model learning.

artificial intelligence, data mining, machine learning, (15 more...)

2510.15422

Country: Asia > China (0.28)

Genre: Overview (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

arXiv.org Machine LearningOct-20-2025

RankSEG-RMA: An Efficient Segmentation Algorithm via Reciprocal Moment Approximation

Wang, Zixun, Dai, Ben

Semantic segmentation labels each pixel in an image with its corresponding class, and is typically evaluated using the Intersection over Union (IoU) and Dice metrics to quantify the overlap between predicted and ground-truth segmentation masks. In the literature, most existing methods estimate pixel-wise class probabilities, then apply argmax or thresholding to obtain the final prediction. These methods have been shown to generally lead to inconsistent or suboptimal results, as they do not directly maximize segmentation metrics. To address this issue, a novel consistent segmentation framework, RankSEG, has been proposed, which includes RankDice and RankIoU specifically designed to optimize the Dice and IoU metrics, respectively. Although RankSEG almost guarantees improved performance, it suffers from two major drawbacks. First, it is its computational expense-RankDice has a complexity of O(d log d) with a substantial constant factor (where d represents the number of pixels), while RankIoU exhibits even higher complexity O(d^2), thus limiting its practical application. For instance, in LiTS, prediction with RankSEG takes 16.33 seconds compared to just 0.01 seconds with the argmax rule. Second, RankSEG is only applicable to overlapping segmentation settings, where multiple classes can occupy the same pixel, which contrasts with standard benchmarks that typically assume non-overlapping segmentation. In this paper, we overcome these two drawbacks via a reciprocal moment approximation (RMA) of RankSEG with the following contributions: (i) we improve RankSEG using RMA, namely RankSEG-RMA, reduces the complexity of both algorithms to O(d) while maintaining comparable performance; (ii) inspired by RMA, we develop a pixel-wise score function that allows efficient implementation for non-overlapping segmentation settings.

machine learning, natural language, segmentation, (17 more...)

2510.15362

Country: North America > Canada (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(3 more...)

da Silva, Eliezer, Klami, Arto, Mesquita, Diego, Urteaga, Iñigo

On the Identifiability of Tensor Ranks via Prior Predictive Matching

arXiv.org Machine LearningOct-17-2025

Selecting the latent dimensions (ranks) in tensor factorization is a central challenge that often relies on heuristic methods. This paper introduces a rigorous approach to determine rank identifiability in probabilistic tensor models, based on prior predictive moment matching. We transform a set of moment matching conditions into a log-linear system of equations in terms of marginal moments, prior hyperparameters, and ranks; establishing an equivalence between rank identifiability and the solvability of such system. We apply this framework to four foundational tensor-models, demonstrating that the linear structure of the PARAFAC/CP model, the chain structure of the Tensor Train model, and the closed-loop structure of the Tensor Ring model yield solvable systems, making their ranks identifiable. In contrast, we prove that the symmetric topology of the Tucker model leads to an underdetermined system, rendering the ranks unidentifiable by this method. For the identifiable models, we derive explicit closed-form rank estimators based on the moments of observed data only. We empirically validate these estimators and evaluate the robustness of the proposal.

artificial intelligence, machine learning, tensor, (16 more...)

2510.14523

Country:

South America > Brazil > São Paulo (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Glaser, Pierre, Widmann, David, Lindsten, Fredrik, Gretton, Arthur

Fast and Scalable Score-Based Kernel Calibration Tests

arXiv.org Machine LearningOct-17-2025

We introduce the Kernel Calibration Conditional Stein Discrepancy test (KCCSD test), a non-parametric, kernel-based test for assessing the calibration of probabilistic models with well-defined scores. In contrast to previous methods, our test avoids the need for possibly expensive expectation approximations while providing control over its type-I error. We achieve these improvements by using a new family of kernels for score-based probabilities that can be estimated without probability density samples, and by using a conditional goodness-of-fit criterion for the KCCSD test's U-statistic. We demonstrate the properties of our test on various synthetic settings.

artificial intelligence, kernel, machine learning, (13 more...)

2510.14711

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Klar, Markus, Stein, Sebastian, Paterson, Fraser, Williamson, John H., Murray-Smith, Roderick

An Active Inference Model of Mouse Point-and-Click Behaviour

We explore the use of Active Inference (AIF) as a computational user model for spatial pointing, a key problem in Human-Computer Interaction (HCI). We present an AIF agent with continuous state, action, and observation spaces, performing one-dimensional mouse pointing and clicking. We use a simple underlying dynamic system to model the mouse cursor dynamics with realistic perceptual delay. In contrast to previous optimal feedback control-based models, the agent's actions are selected by minimizing Expected Free Energy, solely based on preference distributions over percepts, such as observing clicking a button correctly. Our results show that the agent creates plausible pointing movements and clicks when the cursor is over the target, with similar end-point variance to human users. In contrast to other models of pointing, we incorporate fully probabilistic, predictive delay compensation into the agent. The agent shows distinct behaviour for differing target difficulties without the need to retune system parameters, as done in other approaches. We discuss the simulation results and emphasize the challenges in identifying the correct configuration of an AIF agent interacting with continuous systems.

artificial intelligence, machine learning, modeling & simulation, (14 more...)

2510.14611

Country: Asia > Japan (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Marconato, Emanuele, Bortolotti, Samuele, van Krieken, Emile, Morettin, Paolo, Umili, Elena, Vergari, Antonio, Tsamoura, Efthymia, Passerini, Andrea, Teso, Stefano

Symbol Grounding in Neuro-Symbolic AI: A Gentle Introduction to Reasoning Shortcuts

Neuro-symbolic (NeSy) AI aims to develop deep neural networks whose predictions comply with prior knowledge encoding, e.g. safety or structural constraints. As such, it represents one of the most promising avenues for reliable and trustworthy AI. The core idea behind NeSy AI is to combine neural and symbolic steps: neural networks are typically responsible for mapping low-level inputs into high-level symbolic concepts, while symbolic reasoning infers predictions compatible with the extracted concepts and the prior knowledge. Despite their promise, it was recently shown that - whenever the concepts are not supervised directly - NeSy models can be affected by Reasoning Shortcuts (RSs). That is, they can achieve high label accuracy by grounding the concepts incorrectly. RSs can compromise the interpretability of the model's explanations, performance in out-of-distribution scenarios, and therefore reliability. At the same time, RSs are difficult to detect and prevent unless concept supervision is available, which is typically not the case. However, the literature on RSs is scattered, making it difficult for researchers and practitioners to understand and tackle this challenging problem. This overview addresses this issue by providing a gentle introduction to RSs, discussing their causes and consequences in intuitive terms. It also reviews and elucidates existing theoretical characterizations of this phenomenon. Finally, it details methods for dealing with RSs, including mitigation and awareness strategies, and maps their benefits and limitations. By reformulating advanced material in a digestible form, this overview aims to provide a unifying perspective on RSs to lower the bar to entry for tackling them. Ultimately, we hope this overview contributes to the development of reliable NeSy and trustworthy AI models.

artificial intelligence, logic & formal reasoning, machine learning, (17 more...)

2510.14538

Country:

North America > United States (0.67)
Europe > Italy (0.46)
Europe > United Kingdom > England (0.27)

Genre: Overview (0.88)

Industry: Transportation > Ground > Road (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(3 more...)

Hiremath, Sujai, Janzing, Dominik, Faller, Philipp, Blöbaum, Patrick, Kirschbaum, Elke, Kasiviswanathan, Shiva Prasad, Gan, Kyra

From Guess2Graph: When and How Can Unreliable Experts Safely Boost Causal Discovery in Finite Samples?

Causal discovery algorithms often perform poorly with limited samples. While integrating expert knowledge (including from LLMs) as constraints promises to improve performance, guarantees for existing methods require perfect predictions or uncertainty estimates, making them unreliable for practical use. We propose the Guess2Graph (G2G) framework, which uses expert guesses to guide the sequence of statistical tests rather than replacing them. This maintains statistical consistency while enabling performance improvements. We develop two instantiations of G2G: PC-Guess, which augments the PC algorithm, and gPC-Guess, a learning-augmented variant designed to better leverage high-quality expert input. Theoretically, both preserve correctness regardless of expert error, with gPC-Guess provably outperforming its non-augmented counterpart in finite samples when experts are "better than random."

artificial intelligence, machine learning, natural language, (20 more...)

2510.14488

Country:

North America > United States > New York (0.27)
South America > Brazil (0.27)
Europe (0.27)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
(3 more...)

Merris, Matthew D., Andersen, Tim

Data Understanding Survey: Pursuing Improved Dataset Characterization Via Tensor-based Methods

In the evolving domains of Machine Learning and Data Analytics, existing dataset characterization methods such as statistical, structural, and model-based analyses often fail to deliver the deep understanding and insights essential for innovation and explainability. This work surveys the current state-of-the-art conventional data analytic techniques and examines their limitations, and discusses a variety of tensor-based methods and how these may provide a more robust alternative to traditional statistical, structural, and model-based dataset characterization techniques. Through examples, we illustrate how tensor methods unveil nuanced data characteristics, offering enhanced interpretability and actionable intelligence. We advocate for the adoption of tensor-based characterization, promising a leap forward in understanding complex datasets and paving the way for intelligent, explainable data-driven discoveries.

data mining, machine learning, natural language, (19 more...)

2510.14161

Country:

North America > United States (1.00)
Europe (0.92)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(4 more...)

Briding Diffusion Posterior Sampling and Monte Carlo methods: a survey

Janati, Yazid, Durmus, Alain, Olsson, Jimmy, Moulines, Eric

Diffusion models enable the synthesis of highly accurate samples from complex distributions and have become foundational in generative modeling. Recently, they have demonstrated significant potential for solving Bayesian inverse problems by serving as priors. This review offers a comprehensive overview of current methods that leverage \emph{pre-trained} diffusion models alongside Monte Carlo methods to address Bayesian inverse problems without requiring additional training. We show that these methods primarily employ a \emph{twisting} mechanism for the intermediate distributions within the diffusion process, guiding the simulations toward the posterior distribution. We describe how various Monte Carlo methods are then used to aid in sampling from these twisted distributions.

approximation, artificial intelligence, machine learning, (16 more...)

2510.14114

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(2 more...)