AITopics

2507.23771

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Apollonio, Nicola, Franzina, Giovanni, Torrisi, Giovanni Luca

Simulating Posterior Bayesian Neural Networks with Dependent Weights

arXiv.org Machine LearningJul-31-2025

The theoretical study of Bayesian neural networks was initiated by Neal [29] who proved that if a shallow Bayesian neural network is initialized with independent Gaussian parameters (i.e., biases and weights), then the output of the network converges in distribution to a Gaussian process, as the number of neurons grows large ( i.e., in the wide width limit). This result was extended to Bayesian deep neural networks two decades later (see [16, 22, 26]) and only recently it has been made quantitative by the use of the optimal transport theory (see [6] and [33]), by the Stein method for Gaussian approximation (see [3, 4, 8, 13]), and by alternative techniques ([7, 11]). Another promising approach to analyze Bayesian neural networks is through the lens of large deviations. First results in this direction are given in [23]. These findings have been successively generalized in [2, 34]. A different perspective is provided by the so-called mean field analysis of networks (see [27, 15]). The advantage of the Bayesian framework is that it allows to include in the model both prior knowledge and observed data through a prior distribution on network's parameters and a likelihood function, respectively. The emergence of Gaussian processes helped to understand how large neural networks work, how to make them more efficient, and motivated the use of Bayesian regression inference methods, see [22]. However, as noticed by [28] and [21], the connection with Gaussian processes also highlighted the limitations of wide width neural networks with independent and Gaussian distributed weights.

artificial intelligence, machine learning, posterior, (20 more...)

arXiv.org Machine Learning

2507.22095

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Italy (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Volpi, Lorenzo, Moreo, Alejandro, Sebastiani, Fabrizio

Transductive Model Selection under Prior Probability Shift

Transductive learning is a supervised machine learning task in which, unlike in traditional inductive learning, the unlabelled data that require labelling are a finite set and are available at training time. Similarly to inductive learning contexts, transductive learning contexts may be affected by dataset shift, i.e., may be such that the IID assumption does not hold. We here propose a method, tailored to transductive classification contexts, for performing model selection (i.e., hyperparameter optimisation) when the data exhibit prior probability shift, an important type of dataset shift typical of anti-causal learning problems. In our proposed method the hyperparameters can be optimised directly on the unlabelled data to which the trained classifier must be applied; this is unlike traditional model selection methods, that are based on performing cross-validation on the labelled training data. We provide experimental results that show the benefits brought about by our method.

artificial intelligence, classifier, machine learning, (15 more...)

2507.22647

Country: Europe (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.63)

Talluri, Kranthi Kumar, Weidl, Galia, Kasuluru, Vaishnavi

Accident-Driven Congestion Prediction and Simulation: An Explainable Framework Using Advanced Clustering and Bayesian Networks

Traffic congestion due to uncertainties, such as accidents, is a significant issue in urban areas, as the ripple effect of accidents causes longer delays, increased emissions, and safety concerns. To address this issue, we propose a robust framework for predicting the impact of accidents on congestion. We implement Automated Machine Learning (AutoML)-enhanced Deep Embedding Clustering (DEC) to assign congestion labels to accident data and predict congestion probability using a Bayesian Network (BN). The Simulation of Urban Mobility (SUMO) simulation is utilized to evaluate the correctness of BN predictions using evidence-based scenarios. Results demonstrate that the AutoML-enhanced DEC has outperformed traditional clustering approaches. The performance of the proposed BN model achieved an overall accuracy of 95.6%, indicating its ability to understand the complex relationship of accidents causing congestion. Validation in SUMO with evidence-based scenarios demonstrated that the BN model's prediction of congestion states closely matches those of SUMO, indicating the high reliability of the proposed BN model in ensuring smooth urban mobility.

artificial intelligence, congestion, machine learning, (19 more...)

2507.22529

Genre: Research Report (0.84)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Proto-EVFL: Enhanced Vertical Federated Learning via Dual Prototype with Extremely Unaligned Data

Guo, Wei, Duan, Yiyang, Hu, Zhaojun, Tong, Yiqi, Zhuang, Fuzhen, Zhang, Xiao, Dong, Jin, Wu, Ruofan, Liu, Tengfei, Sun, Yifan

--In vertical federated learning (VFL), multiple enterprises address aligned sample scarcity by leveraging massive locally unaligned samples to facilitate collaborative learning. However, unaligned samples across different parties in VFL can be extremely class-imbalanced, leading to insufficient feature representation and limited model prediction space. Specifically, class-imbalanced problems consist of intra-party class imbalance and inter-party class imbalance, which can further cause local model bias and feature contribution inconsistency issues, respectively. T o address the above challenges, we propose Proto-EVFL, an enhanced VFL framework via dual prototypes. We first introduce class prototypes for each party to learn relationships between classes in the latent space, allowing the active party to predict unseen classes. We further design a probabilistic dual prototype learning scheme to dynamically select unaligned samples by conditional optimal transport cost with class prior probability. Moreover, a mixed prior guided module guides this selection process by combining local and global class prior probabilities. Finally, we adopt an adaptive gated feature aggregation strategy to mitigate feature contribution inconsistency by dynamically weighting and aggregating local features across different parties. We proved that Proto-EVFL, as the first bi-level optimization framework in VFL, has a convergence rate of 1 / T . Even in a zero-shot scenario with one unseen class, it outperforms baselines by at least 6.97%. NTRODUCTION indicates equal contribution, * represents the corresponding authors Wei Guo, Yiyang Duan and Fuzhen Zhuang are with the School of Artificial Intelligence, Beihang University, Beijing 100083, China (e-mail: { guowei, duanyiyang, zhuangfuzhen }@buaa.edu.cn). Xiao Zhang is with the School of Computer Science and Technology, Shan-dong University, Shandong 266237, China (e-mail: xiaozhang@sdu.edu.cn). Zhaojun Hu is with the Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing 100872, China (e-mail: huzhao-jun@ruc.edu.cn).

artificial intelligence, bayesian inference, machine learning, (16 more...)

2507.22488

Country: Asia > China > Beijing > Beijing (0.45)

Genre: Research Report > New Finding (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
(2 more...)

Yu, Yifan, Xiu, Shengjie, Palomar, Daniel P.

Robust Filtering and Learning in State-Space Models: Skewness and Heavy Tails Via Asymmetric Laplace Distribution

State-space models are pivotal for dynamic system analysis but often struggle with outlier data that deviates from Gaussian distributions, frequently exhibiting skewness and heavy tails. This paper introduces a robust extension utilizing the asymmetric Laplace distribution, specifically tailored to capture these complex characteristics. We propose an efficient variational Bayes algorithm and a novel single-loop parameter estimation strategy, significantly enhancing the efficiency of the filtering, smoothing, and parameter estimation processes. Our comprehensive experiments demonstrate that our methods provide consistently robust performance across various noise settings without the need for manual hyperparameter adjustments. In stark contrast, existing models generally rely on specific noise conditions and necessitate extensive manual tuning. Moreover, our approach uses far fewer computational resources, thereby validating the model's effectiveness and underscoring its potential for practical applications in fields such as robust control and financial modeling.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2507.22343

Country:

North America > United States (1.00)
Europe (1.00)
Asia > China (0.67)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Sitdhipol, Supawich, Sukprasongdee, Waritwong, Chuangsuwanich, Ekapol, Tse, Rina

Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations

Fusing information from human observations can help robots overcome sensing limitations in collaborative tasks. However, an uncertainty-aware fusion framework requires a grounded likelihood representing the uncertainty of human inputs. This paper presents a Feature Pyramid Likelihood Grounding Network (FP-LGN) that grounds spatial language by learning relevant map image features and their relationships with spatial relation semantics. The model is trained as a probability estimator to capture aleatoric uncertainty in human language using three-stage curriculum learning. Results showed that FP-LGN matched expert-designed rules in mean Negative Log-Likelihood (NLL) and demonstrated greater robustness with lower standard deviation. Collaborative sensing results demonstrated that the grounded likelihood successfully enabled uncertainty-aware fusion of heterogeneous human language observations and robot sensor measurements, achieving significant improvements in human-robot collaborative task performance.

artificial intelligence, likelihood, machine learning, (17 more...)

2507.19947

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Commercial Services & Supplies > Security & Alarm Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.68)
(2 more...)

Aich, Agnideep, Aich, Ashit Baran, Wade, Bruce

Measuring Sample Quality with Copula Discrepancies

arXiv.org Machine LearningJul-30-2025

The scalable Markov chain Monte Carlo (MCMC) algorithms that underpin modern Bayesian machine learning, such as Stochastic Gradient Langevin Dynamics (SGLD), sacrifice asymptotic exactness for computational speed, creating a critical diagnostic gap: traditional sample quality measures fail catastrophically when applied to biased samplers. While powerful Stein-based diagnostics can detect distributional mismatches, they provide no direct assessment of dependence structure, often the primary inferential target in multivariate problems. We introduce the Copula Discrepancy (CD), a principled and computationally efficient diagnostic that leverages Sklar's theorem to isolate and quantify the fidelity of a sample's dependence structure independent of its marginals. Our theoretical framework provides the first structure-aware diagnostic specifically designed for the era of approximate inference. Empirically, we demonstrate that a moment-based CD dramatically outperforms standard diagnostics like effective sample size for hyperparameter selection in biased MCMC, correctly identifying optimal configurations where traditional methods fail. Furthermore, our robust MLE-based variant can detect subtle but critical mismatches in tail dependence that remain invisible to rank correlation-based approaches, distinguishing between samples with identical Kendall's tau but fundamentally different extreme-event behavior. With computational overhead orders of magnitude lower than existing Stein discrepancies, the CD provides both immediate practical value for MCMC practitioners and a theoretical foundation for the next generation of structure-aware sample quality assessment.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

2507.21434

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
(7 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Guan, Muhan, Farokhi, Farhad, Zhu, Jingge

A DPI-PAC-Bayesian Framework for Generalization Bounds

arXiv.org Machine LearningJul-30-2025

We develop a unified Data Processing Inequality PAC-Bayesian framework -- abbreviated DPI-PAC-Bayesian -- for deriving generalization error bounds in the supervised learning setting. By embedding the Data Processing Inequality (DPI) into the change-of-measure technique, we obtain explicit bounds on the binary Kullback-Leibler generalization gap for both Rényi divergence and any $f$-divergence measured between a data-independent prior distribution and an algorithm-dependent posterior distribution. We present three bounds derived under our framework using Rényi, Hellinger $p$ and Chi-Squared divergences. Additionally, our framework also demonstrates a close connection with other well-known bounds. When the prior distribution is chosen to be uniform, our bounds recover the classical Occam's Razor bound and, crucially, eliminate the extraneous $\log(2\sqrt{n})/n$ slack present in the PAC-Bayes bound, thereby achieving tighter results. The framework thus bridges data-processing and PAC-Bayesian perspectives, providing a flexible, information-theoretic tool to construct generalization guarantees.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2507.14795

Country:

Oceania > Australia > Victoria (0.04)
North America > United States (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Pojer, Raffaele, Passerini, Andrea, Larsen, Kim G., Jaeger, Manfred

A Neuro-Symbolic Approach for Probabilistic Reasoning on Graph Data

arXiv.org Artificial IntelligenceJul-30-2025

Graph neural networks (GNNs) excel at predictive tasks on graph-structured data but often lack the ability to incorporate symbolic domain knowledge and perform general reasoning. Relational Bayesian Networks (RBNs), in contrast, enable fully generative probabilistic modeling over graph-like structures and support rich symbolic knowledge and probabilistic inference. This paper presents a neuro-symbolic framework that seamlessly integrates GNNs into RBNs, combining the learning strength of GNNs with the flexible reasoning capabilities of RBNs. We develop two implementations of this integration: one compiles GNNs directly into the native RBN language, while the other maintains the GNN as an external component. Both approaches preserve the semantics and computational properties of GNNs while fully aligning with the RBN modeling paradigm. We also propose a maximum a-posteriori (MAP) inference method for these neuro-symbolic models. To demonstrate the framework's versatility, we apply it to two distinct problems. First, we transform a GNN for node classification into a collective classification model that explicitly models homo- and heterophilic label patterns, substantially improving accuracy. Second, we introduce a multi-objective network optimization problem in environmental planning, where MAP inference supports complex decision-making. Both applications include new publicly available benchmark datasets. This work introduces a powerful and coherent neuro-symbolic approach to graph data, bridging learning and reasoning in ways that enable novel applications and improved performance across diverse tasks.

artificial intelligence, machine learning, node, (20 more...)

2507.21873

Country:

North America > United States (0.92)
Europe (0.68)

Genre: Research Report > New Finding (0.46)

Industry: Food & Agriculture > Agriculture (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)