AITopics

2507.23426

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Park, Jun Won, Zhao, Kangyu, Rane, Sanket

Spatiodynamic inference using vision-based generative modelling

arXiv.org Machine Learning Aug-1-2025

Biological systems commonly exhibit complex spatiotemporal patterns whose underlying generative mechanisms pose a significant analytical challenge. Traditional approaches to spatiodynamic inference rely on dimensionality reduction through summary statistics, which sacrifice complexity and interdependent structure intrinsic to these data in favor of parameter identifiability. This imposes a fundamental constraint on reliably extracting mechanistic insights from spatiotemporal data, highlighting the need for analytical frameworks that preserve the full richness of these dynamical systems. To address this, we developed a simulation-based inference framework that employs vision transformer-driven variational encoding to generate compact representations of the data, exploiting the inherent contextual dependencies. These representations are subsequently integrated into a likelihood-free Bayesian approach for parameter inference. The central idea is to construct a fine-grained, structured mesh of latent representations from simulated dynamics through systematic exploration of the parameter space. This encoded mesh of latent embeddings then serves as a reference map for retrieving parameter values that correspond to observed data. By integrating generative modeling with Bayesian principles, our approach provides a unified inference framework to identify both spatial and temporal patterns that manifest in multivariate dynamical systems.
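The "reference map" idea in the abstract can be illustrated with a toy lookup. This is only a sketch under stated assumptions: the damped-oscillation simulator, the chunk-averaging `encode` function (a crude stand-in for the vision-transformer variational encoder), and the nearest-neighbour retrieval rule are all hypothetical and are not taken from the paper.

```python
import math

def simulate(rate, freq, steps=50):
    """Toy dynamical system (assumption): a damped oscillation on a fixed time grid."""
    return [math.exp(-rate * t * 0.1) * math.cos(freq * t * 0.1) for t in range(steps)]

def encode(series, dim=5):
    """Stand-in encoder (assumption): average the series over `dim` contiguous chunks."""
    chunk = len(series) // dim
    return [sum(series[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]

def build_mesh(rates, freqs):
    """Simulate every parameter pair on the grid and store (parameters, latent) pairs."""
    return [((r, f), encode(simulate(r, f))) for r in rates for f in freqs]

def retrieve(mesh, observed, k=3):
    """Return the k parameter pairs whose latents lie closest to the observed latent."""
    z = encode(observed)
    ranked = sorted(mesh, key=lambda item: math.dist(item[1], z))
    return [params for params, _ in ranked[:k]]

if __name__ == "__main__":
    rates = [i / 10 for i in range(1, 11)]   # 0.1 ... 1.0
    freqs = [j / 2 for j in range(1, 7)]     # 0.5 ... 3.0
    mesh = build_mesh(rates, freqs)
    observed = simulate(0.5, 2.0)            # generated on a mesh point for the demo
    print(retrieve(mesh, observed, k=3))
```

The returned neighbours can be read as a rough, sample-based approximation to the parameters compatible with the observation. A finer mesh or a learned encoder would change the quality of that approximation but not the lookup logic.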

artificial intelligence, machine learning, natural language, (19 more...)

2507.22256

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Epidemiology (0.94)
Health & Medicine > Therapeutic Area > Immunology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
(2 more...)

arXiv.org Artificial Intelligence Aug-1-2025

Incorporating structural uncertainty in causal decision making

Kaptein, Maurits

Practitioners making decisions based on causal effects typically ignore structural uncertainty. We analyze when this uncertainty is consequential enough to warrant methodological solutions (Bayesian model averaging over competing causal structures). Focusing on bivariate relationships ($X \rightarrow Y$ vs. $X \leftarrow Y$), we establish that model averaging is beneficial when: (1) structural uncertainty is moderate to high, (2) causal effects differ substantially between structures, and (3) loss functions are sufficiently sensitive to the size of the causal effect. We prove optimality results of our suggested methodological solution under regularity conditions and demonstrate through simulations that modern causal discovery methods can provide, within limits, the necessary quantification. Our framework complements existing robust causal inference approaches by addressing a distinct source of uncertainty typically overlooked in practice.
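A small numerical sketch may help make the three conditions concrete. It assumes a binary decision (intervene on $X$ or not), a linear payoff in the causal effect, and no confounding, so that intervening on $X$ has zero effect on $Y$ under $X \leftarrow Y$. The function names and the numbers are illustrative assumptions, not the paper's setup.

```python
def averaged_effect(p_forward, effect_if_forward, effect_if_reverse=0.0):
    """Posterior-weighted effect of intervening on X, averaged over both structures."""
    return p_forward * effect_if_forward + (1.0 - p_forward) * effect_if_reverse

def expected_utility(intervene, p_forward, effect_if_forward, value_per_unit, cost):
    """Expected payoff of a decision under structural uncertainty."""
    if not intervene:
        return 0.0
    return averaged_effect(p_forward, effect_if_forward) * value_per_unit - cost

def plug_in_decision(p_forward, effect_if_forward, value_per_unit, cost):
    """Commit to the most probable structure, then decide as if it were certain."""
    effect = effect_if_forward if p_forward >= 0.5 else 0.0
    return effect * value_per_unit > cost

def averaged_decision(p_forward, effect_if_forward, value_per_unit, cost):
    """Decide using the model-averaged effect."""
    return averaged_effect(p_forward, effect_if_forward) * value_per_unit > cost

if __name__ == "__main__":
    args = (0.6, 2.0, 1.0, 1.5)  # P(X -> Y), effect under X -> Y, value per unit, cost
    for name, rule in [("plug-in", plug_in_decision), ("averaged", averaged_decision)]:
        choice = rule(*args)
        print(name, choice, expected_utility(choice, *args))
```

With a posterior of 0.6 on $X \rightarrow Y$, the plug-in rule intervenes and incurs a negative expected payoff, while the averaged rule abstains. If the posterior were close to 1, or the cost far from the averaged effect, the two rules would agree, which mirrors the conditions listed in the abstract.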

artificial intelligence, machine learning, modeling & simulation, (16 more...)

2507.23495

Country: Europe (0.28)

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Modeling & Simulation (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Abdaljalil, Samir, Kurban, Hasan, Qaraqe, Khalid, Serpedin, Erchin

Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models

arXiv.org Artificial Intelligence Aug-1-2025

Large language models (LLMs) have shown strong performance across natural language reasoning tasks, yet their reasoning processes remain brittle and difficult to interpret. Prompting techniques like Chain-of-Thought (CoT) enhance reliability by eliciting intermediate reasoning steps or aggregating multiple outputs. However, they lack mechanisms for enforcing logical structure and assessing internal coherence. We introduce Theorem-of-Thought (ToTh), a novel framework that models reasoning as collaboration among three parallel agents, each simulating a distinct mode of inference: abductive, deductive, and inductive. Each agent produces a reasoning trace, which is structured into a formal reasoning graph. To evaluate consistency, we apply Bayesian belief propagation guided by natural language inference (NLI), assigning confidence scores to each step. The most coherent graph is selected to derive the final answer. Experiments on symbolic (WebOfLies) and numerical (MultiArith) reasoning benchmarks show that ToTh consistently outperforms CoT, Self-Consistency, and CoT-Decoding across multiple LLMs, while producing interpretable and logically grounded reasoning chains. Our findings suggest a promising direction for building more robust and cognitively inspired LLM reasoning. The implementation is available at https://github.com/KurbanIntelligenceLab/theorem-of-thought.

large language model, machine learning, natural language, (18 more...)

2506.07106

Country:

North America > United States > Texas (0.28)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

arXiv.org Artificial Intelligence Aug-1-2025

Consensus-Driven Active Model Selection

Kay, Justin, Van Horn, Grant, Maji, Subhransu, Sheldon, Daniel, Beery, Sara

The widespread availability of off-the-shelf machine learning models poses a challenge: which model, of the many available candidates, should be chosen for a given data analysis task? This question of model selection is traditionally answered by collecting and annotating a validation dataset -- a costly and time-intensive process. We propose a method for active model selection, using predictions from candidate models to prioritize the labeling of test data points that efficiently differentiate the best candidate. Our method, CODA, performs consensus-driven active model selection by modeling relationships between classifiers, categories, and data points within a probabilistic framework. The framework uses the consensus and disagreement between models in the candidate pool to guide the label acquisition process, and Bayesian inference to update beliefs about which model is best as more information is collected. We validate our approach by curating a collection of 26 benchmark tasks capturing a range of model selection scenarios. CODA outperforms existing methods for active model selection significantly, reducing the annotation effort required to discover the best model by upwards of 70% compared to the previous state-of-the-art. Code and data are available at https://github.com/justinkay/coda.
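The general loop described above (use model disagreement to choose what to label, then update beliefs about each model) can be sketched in a much simpler form than CODA itself. The version below is an assumption-laden illustration: it scores items by vote disagreement and keeps an independent Beta posterior over each model's accuracy, whereas the paper models relationships between classifiers, categories, and data points jointly. All function names are hypothetical.

```python
from collections import Counter

def disagreement(votes):
    """One minus the share of the majority label among the candidate models' votes."""
    top_count = Counter(votes).most_common(1)[0][1]
    return 1.0 - top_count / len(votes)

def active_selection(predictions, oracle, budget):
    """Label the most contested items first and return posterior mean accuracies.

    predictions: one list of predicted labels per candidate model.
    oracle: callable returning the true label for an item index (the annotator).
    """
    n_models, n_items = len(predictions), len(predictions[0])
    alpha = [1.0] * n_models  # Beta(1, 1) prior: pseudo-count of correct predictions
    beta = [1.0] * n_models   # pseudo-count of incorrect predictions
    unlabeled = set(range(n_items))
    for _ in range(min(budget, n_items)):
        # Ties are broken in favour of the lowest item index.
        item = max(sorted(unlabeled),
                   key=lambda i: disagreement([p[i] for p in predictions]))
        unlabeled.remove(item)
        truth = oracle(item)
        for m in range(n_models):
            if predictions[m][item] == truth:
                alpha[m] += 1.0
            else:
                beta[m] += 1.0
    return [a / (a + b) for a, b in zip(alpha, beta)]

if __name__ == "__main__":
    truth = [0, 1, 0, 1, 0, 1]
    predictions = [
        [0, 1, 0, 1, 0, 1],  # model 0
        [0, 0, 0, 0, 0, 1],  # model 1
        [0, 0, 0, 0, 0, 0],  # model 2
    ]
    print(active_selection(predictions, lambda i: truth[i], budget=3))
```

Items on which all candidates agree carry no information about which model is better, so the labeling budget is spent only where the candidates differ.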

artificial intelligence, machine learning, model selection, (17 more...)

2507.23771

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

arXiv.org Machine Learning Jul-31-2025

LVM-GP: Uncertainty-Aware PDE Solver via coupling latent variable model and Gaussian process

Feng, Xiaodong, Guo, Ling, Wan, Xiaoliang, Wu, Hao, Zhou, Tao, Zhou, Wenwen

We propose a novel probabilistic framework, termed LVM-GP, for uncertainty quantification in solving forward and inverse partial differential equations (PDEs) with noisy data. The core idea is to construct a stochastic mapping from the input to a high-dimensional latent representation, enabling uncertainty-aware prediction of the solution. Specifically, the architecture consists of a confidence-aware encoder and a probabilistic decoder. The encoder implements a high-dimensional latent variable model based on a Gaussian process (LVM-GP), where the latent representation is constructed by interpolating between a learnable deterministic feature and a Gaussian process prior, with the interpolation strength adaptively controlled by a confidence function learned from data. The decoder defines a conditional Gaussian distribution over the solution field, where the mean is predicted by a neural operator applied to the latent representation, allowing the model to learn flexible function-to-function mapping. Moreover, physical laws are enforced as soft constraints in the loss function to ensure consistency with the underlying PDE structure. Compared to existing approaches such as Bayesian physics-informed neural networks (B-PINNs) and deep ensembles, the proposed framework can efficiently capture functional dependencies via merging a latent Gaussian process and neural operator, resulting in competitive predictive accuracy and robust uncertainty quantification. Numerical experiments demonstrate the effectiveness and reliability of the method.
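One way to read the encoder and decoder described above is the following sketch. The symbols $m_\theta$, $g$, $c$, $\mu_\phi$, and $\sigma$ are notation introduced here for illustration, and the convention that high confidence favours the deterministic feature is an assumption. The abstract does not give the exact form.

$$
z(x) = c(x)\, m_\theta(x) + \bigl(1 - c(x)\bigr)\, g(x), \qquad g \sim \mathcal{GP}(0, k), \qquad c(x) \in [0, 1],
$$

$$
u(x) \mid z \sim \mathcal{N}\bigl(\mu_\phi[z](x),\ \sigma^2\bigr).
$$

Here $m_\theta$ is the learnable deterministic feature, $g$ is a draw from the Gaussian process prior with kernel $k$, $c$ is the learned confidence function, and $\mu_\phi$ is the neural operator that maps the latent representation to the mean of the solution field $u$. Under this reading, $c(x) \to 1$ yields a nearly deterministic latent, while $c(x) \to 0$ falls back on the prior.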

artificial intelligence, machine learning, uncertainty quantification, (18 more...)

2507.22493

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Apollonio, Nicola, Franzina, Giovanni, Torrisi, Giovanni Luca

Simulating Posterior Bayesian Neural Networks with Dependent Weights

arXiv.org Machine Learning Jul-31-2025

The theoretical study of Bayesian neural networks was initiated by Neal [29], who proved that if a shallow Bayesian neural network is initialized with independent Gaussian parameters (i.e., biases and weights), then the output of the network converges in distribution to a Gaussian process as the number of neurons grows large (i.e., in the wide-width limit). This result was extended to Bayesian deep neural networks two decades later (see [16, 22, 26]), and only recently has it been made quantitative by the use of optimal transport theory (see [6] and [33]), by the Stein method for Gaussian approximation (see [3, 4, 8, 13]), and by alternative techniques ([7, 11]). Another promising approach to analyzing Bayesian neural networks is through the lens of large deviations. The first results in this direction are given in [23]. These findings have subsequently been generalized in [2, 34]. A different perspective is provided by the so-called mean field analysis of networks (see [27, 15]). The advantage of the Bayesian framework is that it allows one to incorporate both prior knowledge and observed data into the model, through a prior distribution on the network's parameters and a likelihood function, respectively. The emergence of Gaussian processes helped to explain how large neural networks work and how to make them more efficient, and motivated the use of Bayesian regression inference methods; see [22]. However, as noted by [28] and [21], the connection with Gaussian processes also highlighted the limitations of wide-width neural networks with independent and Gaussian distributed weights.
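The wide-width behaviour attributed to Neal above can be checked empirically with a small simulation. The sketch below is an illustration under assumptions: a single scalar input, tanh activations, standard normal parameters, no output bias, and output weights scaled by $1/\sqrt{\text{width}}$. It examines only the one-dimensional marginal at a single input, not the full process.

```python
import math
import random

def sample_network_output(x, width, rng):
    """Draw one shallow tanh network with i.i.d. N(0, 1) parameters and evaluate it at x.

    Output weights are scaled by 1/sqrt(width) so the output variance stays bounded
    as the width grows.
    """
    total = 0.0
    for _ in range(width):
        w = rng.gauss(0.0, 1.0)  # input weight
        a = rng.gauss(0.0, 1.0)  # hidden bias
        v = rng.gauss(0.0, 1.0)  # output weight
        total += v * math.tanh(w * x + a)
    return total / math.sqrt(width)

def output_samples(x=1.0, width=200, draws=2000, seed=0):
    """Evaluate `draws` independently sampled networks at the same input x."""
    rng = random.Random(seed)
    return [sample_network_output(x, width, rng) for _ in range(draws)]

def summarize(samples):
    """Return mean, variance, and the fraction of samples within one standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    std = math.sqrt(var)
    within = sum(abs(s - mean) <= std for s in samples) / n
    return mean, var, within

if __name__ == "__main__":
    mean, var, within = summarize(output_samples())
    print(f"mean={mean:.3f} var={var:.3f} within_one_std={within:.3f}")
```

For a Gaussian, roughly 68% of the mass lies within one standard deviation of the mean, so the last statistic gives a quick sanity check on the shape of the output distribution. Rerunning with a small `width` shows how far narrow networks deviate from that.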

artificial intelligence, machine learning, posterior, (20 more...)

2507.22095

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Italy (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Volpi, Lorenzo, Moreo, Alejandro, Sebastiani, Fabrizio

Transductive Model Selection under Prior Probability Shift

arXiv.org Artificial Intelligence Jul-31-2025

Transductive learning is a supervised machine learning task in which, unlike in traditional inductive learning, the unlabelled data that require labelling are a finite set and are available at training time. As in inductive learning contexts, transductive learning contexts may be affected by dataset shift, i.e., the IID assumption may not hold. We here propose a method, tailored to transductive classification contexts, for performing model selection (i.e., hyperparameter optimisation) when the data exhibit prior probability shift, an important type of dataset shift typical of anti-causal learning problems. In our proposed method the hyperparameters can be optimised directly on the unlabelled data to which the trained classifier must be applied; this is unlike traditional model selection methods, which are based on performing cross-validation on the labelled training data. We provide experimental results that show the benefits brought about by our method.
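For readers unfamiliar with prior probability shift, the sketch below shows the standard reweighting of a classifier's posteriors when class priors change but the class-conditional distributions $p(x \mid y)$ do not. It illustrates the type of shift only and is not the model selection method proposed in the paper. The function name is hypothetical, and the test-set priors are assumed to be known or estimated separately.

```python
def adjust_for_prior_shift(posteriors, train_priors, test_priors):
    """Rescale class posteriors from training priors to test priors.

    Under prior probability shift, p_test(y | x) is proportional to
    p_train(y | x) * p_test(y) / p_train(y).
    """
    weighted = [p * q / r for p, q, r in zip(posteriors, test_priors, train_priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

if __name__ == "__main__":
    # A classifier trained on balanced data is undecided about this item,
    # but the positive class is much more common in the target set.
    print(adjust_for_prior_shift([0.5, 0.5], [0.5, 0.5], [0.9, 0.1]))
```

Because a classifier's error on the target set depends on the target priors, hyperparameters that look best under cross-validation on the training distribution need not be best after such a shift.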

artificial intelligence, classifier, machine learning, (15 more...)

2507.22647

Country: Europe (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.63)

Talluri, Kranthi Kumar, Weidl, Galia, Kasuluru, Vaishnavi

Accident-Driven Congestion Prediction and Simulation: An Explainable Framework Using Advanced Clustering and Bayesian Networks

arXiv.org Artificial Intelligence Jul-31-2025

Traffic congestion due to uncertainties, such as accidents, is a significant issue in urban areas, as the ripple effect of accidents causes longer delays, increased emissions, and safety concerns. To address this issue, we propose a robust framework for predicting the impact of accidents on congestion. We implement Automated Machine Learning (AutoML)-enhanced Deep Embedding Clustering (DEC) to assign congestion labels to accident data and predict congestion probability using a Bayesian Network (BN). The Simulation of Urban Mobility (SUMO) simulator is used to evaluate the correctness of BN predictions using evidence-based scenarios. Results demonstrate that the AutoML-enhanced DEC outperforms traditional clustering approaches. The proposed BN model achieved an overall accuracy of 95.6%, indicating its ability to capture the complex relationship between accidents and congestion. Validation in SUMO with evidence-based scenarios demonstrated that the BN model's predictions of congestion states closely match those of SUMO, indicating the high reliability of the proposed BN model in ensuring smooth urban mobility.

artificial intelligence, congestion, machine learning, (19 more...)

2507.22529

Genre: Research Report (0.84)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence Jul-31-2025

Proto-EVFL: Enhanced Vertical Federated Learning via Dual Prototype with Extremely Unaligned Data

Guo, Wei, Duan, Yiyang, Hu, Zhaojun, Tong, Yiqi, Zhuang, Fuzhen, Zhang, Xiao, Dong, Jin, Wu, Ruofan, Liu, Tengfei, Sun, Yifan

In vertical federated learning (VFL), multiple enterprises address aligned sample scarcity by leveraging massive locally unaligned samples to facilitate collaborative learning. However, unaligned samples across different parties in VFL can be extremely class-imbalanced, leading to insufficient feature representation and limited model prediction space. Specifically, class-imbalance problems consist of intra-party class imbalance and inter-party class imbalance, which can further cause local model bias and feature contribution inconsistency issues, respectively. To address the above challenges, we propose Proto-EVFL, an enhanced VFL framework via dual prototypes. We first introduce class prototypes for each party to learn relationships between classes in the latent space, allowing the active party to predict unseen classes. We further design a probabilistic dual prototype learning scheme to dynamically select unaligned samples by conditional optimal transport cost with class prior probability. Moreover, a mixed prior guided module guides this selection process by combining local and global class prior probabilities. Finally, we adopt an adaptive gated feature aggregation strategy to mitigate feature contribution inconsistency by dynamically weighting and aggregating local features across different parties. We prove that Proto-EVFL, as the first bi-level optimization framework in VFL, has a convergence rate of $1/T$. Even in a zero-shot scenario with one unseen class, it outperforms baselines by at least 6.97%.

Author affiliations: Wei Guo, Yiyang Duan and Fuzhen Zhuang are with the School of Artificial Intelligence, Beihang University, Beijing 100083, China (e-mail: {guowei, duanyiyang, zhuangfuzhen}@buaa.edu.cn). Xiao Zhang is with the School of Computer Science and Technology, Shandong University, Shandong 266237, China (e-mail: xiaozhang@sdu.edu.cn). Zhaojun Hu is with the Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing 100872, China (e-mail: huzhaojun@ruc.edu.cn).

artificial intelligence, bayesian inference, machine learning, (16 more...)

2507.22488

Country: Asia > China > Beijing > Beijing (0.45)

Genre: Research Report > New Finding (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
(2 more...)