Bayesian Learning
Entropic Causal Inference: Graph Identifiability
Compton, Spencer, Greenewald, Kristjan, Katz, Dmitriy, Kocaoglu, Murat
Entropic causal inference is a recent framework for learning the causal graph between two variables from observational data by finding the information-theoretically simplest structural explanation of the data, i.e., the model with smallest entropy. In our work, we first extend the causal graph identifiability result in the two-variable setting under relaxed assumptions. We then show the first identifiability result using the entropic approach for learning causal graphs with more than two nodes. Our approach utilizes the property that ancestrality between a source node and its descendants can be determined using the bivariate entropic tests. We provide a sound sequential peeling algorithm for general graphs that relies on this property. We also propose a heuristic algorithm for small graphs that shows strong empirical performance. We rigorously evaluate the performance of our algorithms on synthetic data generated from a variety of models, observing improvement over prior work. Finally, we test our algorithms on real-world datasets.
Information Geometry of Variational Bayes
We highlight a fundamental connection between information geometry and variational Bayes (VB) and discuss its consequences for machine learning. Under certain conditions, a VB solution always requires estimation or computation of natural gradients. We show several consequences of this fact by using the natural-gradient descent algorithm of Khan and Rue (2023) called the Bayesian Learning Rule (BLR). These include (i) a simplification of Bayes' rule as addition of natural gradients, (ii) a generalization of quadratic surrogates used in gradient-based methods, and (iii) a large-scale implementation of VB algorithms for large language models. Neither the connection nor its consequences are new but we further emphasize the common origins of the two fields of information geometry and Bayes with a hope to facilitate more work at the intersection of the two fields.
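As a small illustration of consequence (i), consider the conjugate Gaussian case, where Bayes' rule reduces to adding natural parameters. This is a standard exponential-family fact used here as a stand-in; it is not taken from the paper, and the helper names and numbers below are hypothetical.

```python
def to_natural(mean, var):
    """Natural parameters (eta1, eta2) of a 1-D Gaussian N(mean, var)."""
    return (mean / var, -0.5 / var)


def from_natural(eta1, eta2):
    """Recover (mean, var) from Gaussian natural parameters."""
    var = -0.5 / eta2
    return (eta1 * var, var)


# Hypothetical setup: prior theta ~ N(0, 1), one observation y = 2.0
# with known noise variance 0.5, so y | theta ~ N(theta, 0.5).
prior = to_natural(0.0, 1.0)

# Viewed as a function of theta, the likelihood is proportional to a
# Gaussian in theta with mean y and variance 0.5.
likelihood_term = to_natural(2.0, 0.5)

# Bayes' rule in natural-parameter form: plain addition.
posterior = (prior[0] + likelihood_term[0], prior[1] + likelihood_term[1])
post_mean, post_var = from_natural(*posterior)
```

The result matches the textbook update: the posterior precision is the sum of the prior and likelihood precisions, and the posterior mean is the precision-weighted average of the prior mean and the observation.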
(SP)$^2$-Net: A Neural Spatial Spectrum Method for DOA Estimation
Berman, Lioz, Gannot, Sharon, Tirer, Tom
We consider the problem of estimating the directions of arrival (DOAs) of multiple sources from a single snapshot of an antenna array, a task with many practical applications. In such settings, the classical Bartlett beamformer is commonly used, as maximum likelihood estimation becomes impractical when the number of sources is unknown or large, and spectral methods based on the sample covariance are not applicable due to the lack of multiple snapshots. However, the accuracy and resolution of the Bartlett beamformer are fundamentally limited by the array aperture. In this paper, we propose a deep learning technique, comprising a novel architecture and training strategy, for generating a high-resolution spatial spectrum from a single snapshot. Specifically, we train a deep neural network that takes the measurements and a hypothesis angle as input and learns to output a score consistent with the capabilities of a much wider array. At inference time, a heatmap can be produced by scanning an arbitrary set of angles. We demonstrate the advantages of our trained model, named (SP)$^2$-Net, over the Bartlett beamformer and sparsity-based DOA estimation methods.
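For reference, the classical single-snapshot Bartlett spectrum that serves as the baseline above can be sketched as follows. This is a minimal sketch assuming a uniform linear array with half-wavelength spacing and a noise-free snapshot; the function names and the test scenario are hypothetical and do not reproduce (SP)$^2$-Net.

```python
import cmath
import math


def steering_vector(theta_deg, n_sensors, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths)."""
    phase = 2.0 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * k) for k in range(n_sensors)]


def bartlett_spectrum(snapshot, angles_deg, spacing=0.5):
    """Single-snapshot Bartlett spectrum |a(theta)^H x|^2 / N^2 per angle."""
    n = len(snapshot)
    spectrum = []
    for theta in angles_deg:
        a = steering_vector(theta, n, spacing)
        # Correlate the snapshot with the hypothesised steering vector.
        corr = sum(ak.conjugate() * xk for ak, xk in zip(a, snapshot))
        spectrum.append(abs(corr) ** 2 / n ** 2)
    return spectrum


# Hypothetical scenario: one unit-amplitude source at 20 degrees, 8 sensors.
n_sensors = 8
snapshot = steering_vector(20.0, n_sensors)
angles = [float(a) for a in range(-90, 91)]
spectrum = bartlett_spectrum(snapshot, angles)
peak_angle = angles[max(range(len(angles)), key=spectrum.__getitem__)]
```

The main lobe of this spectrum narrows only as the array grows, which is the aperture limitation the abstract refers to.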
FloorSAM: SAM-Guided Floorplan Reconstruction with Semantic-Geometric Fusion
Ye, Han, Wang, Haofu, Zhang, Yunchi, Xiao, Jiangjian, Jin, Yuqiang, Liu, Jinyuan, Zhang, Wen-An, Sychou, Uladzislau, Tuzikov, Alexander, Sobolevskii, Vladislav, Zakharov, Valerii, Sokolov, Boris, Fu, Minglei
Reconstructing building floor plans from point cloud data is a critical technology for indoor navigation, building information modeling (BIM), and high-precision indoor measurement applications. Traditional methods, such as geometric algorithms and Mask R-CNN-based deep learning for mask segmentation, often suffer from sensitivity to noise, limited generalization, and loss of geometric details, severely impacting measurement accuracy. This study proposes an innovative framework, FloorSAM, that integrates room-height point cloud density maps with the guided segmentation capabilities of the Segment Anything Model (SAM) to enhance the precision of floor plan reconstruction from LiDAR point cloud data. By applying grid-based filtering to retain elevation point clouds near the ceiling of each region, combined with adaptive resolution projection and image enhancement techniques, a top-down density map is generated, improving the robustness and accuracy of spatial feature measurement. This framework leverages SAM's zero-shot learning to achieve high-fidelity room segmentation, remarkably enhancing reconstruction and measurement accuracy across diverse building layouts. Subsequently, leveraging SAM's zero-shot guided segmentation capabilities, high-quality room masks are generated based on adaptive prompt points, followed by a multistage filtering process to extract precise semantic masks for individual rooms. Through joint analysis of mask and point cloud modalities, contour extraction and regularization are performed, integrating semantic segmentation with geometric information to produce accurate room floor plans and recover topological relationships between rooms.
Distribution Estimation for Global Data Association via Approximate Bayesian Inference
Jia, Yixuan, Peterson, Mason B., Li, Qingyuan, Tian, Yulun, How, Jonathan P.
Global data association is an essential prerequisite for robot operation in environments seen at different times or by different robots. Repetitive or symmetric data creates significant challenges for existing methods, which typically rely on maximum likelihood estimation or maximum consensus to produce a single set of associations. However, in ambiguous scenarios, the distribution of solutions to global data association problems is often highly multimodal, and such single-solution approaches frequently fail. In this work, we introduce a data association framework that leverages approximate Bayesian inference to capture multiple solution modes to the data association problem, thereby avoiding premature commitment to a single solution under ambiguity. Our approach represents hypothetical solutions as particles that evolve according to a deterministic or randomized update rule to cover the modes of the underlying solution distribution. Furthermore, we show that our method can incorporate optimization constraints imposed by the data association formulation and directly benefit from GPU-parallelized optimization. Extensive simulated and real-world experiments with highly ambiguous data show that our method correctly estimates the distribution over transformations when registering point clouds or object maps.
I. INTRODUCTION
Data association is essential in many robotic applications, enabling key perception technologies such as dynamic object tracking [1]-[3] and simultaneous localization and mapping (SLAM) [4]-[6]. In these scenarios, robots must recognize when an object or feature they are currently observing is the same as something they (or another robot) may have seen from a different perspective. Without correct data association, the environment representation may be inconsistent, leading to undesirable behaviors in downstream tasks (e.g., incorrect associations in loop closure detection can lead to dramatically distorted maps [6]).
Universal Learning of Stochastic Dynamics for Exact Belief Propagation using Bernstein Normalizing Flows
Amorese, Peter, Lahijanian, Morteza
Predicting the distribution of future states in a stochastic system, known as belief propagation, is fundamental to reasoning under uncertainty. However, nonlinear dynamics often make analytical belief propagation intractable, requiring approximate methods. When the system model is unknown and must be learned from data, a key question arises: can we learn a model that (i) universally approximates general nonlinear stochastic dynamics, and (ii) supports analytical belief propagation? This paper establishes the theoretical foundations for a class of models that satisfy both properties. The proposed approach combines the expressiveness of normalizing flows for density estimation with the analytical tractability of Bernstein polynomials. Empirical results show the efficacy of our learned model over state-of-the-art data-driven methods for belief propagation, especially for highly non-linear systems with non-additive, non-Gaussian noise.
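To give a feel for the analytical tractability of Bernstein polynomials mentioned above, here is a minimal sketch of a density on [0, 1] written in the Bernstein basis, with its total mass and mean available in closed form. It illustrates only the basic polynomial property, not the paper's flow model; the weights and function names are hypothetical.

```python
from math import comb


def bernstein_basis(k, n, x):
    """k-th Bernstein basis polynomial of degree n evaluated at x in [0, 1]."""
    return comb(n, k) * x ** k * (1.0 - x) ** (n - k)


def bernstein_density(weights, x):
    """Density (n + 1) * sum_k w_k * b_{k,n}(x) with nonnegative weights.

    Each basis polynomial integrates to 1 / (n + 1) on [0, 1], so the
    density integrates to sum(weights).
    """
    n = len(weights) - 1
    return (n + 1) * sum(w * bernstein_basis(k, n, x) for k, w in enumerate(weights))


def bernstein_mean(weights):
    """Closed-form mean: sum_k w_k * (k + 1) / (n + 2)."""
    n = len(weights) - 1
    return sum(w * (k + 1) / (n + 2) for k, w in enumerate(weights))


# Hypothetical weights on the probability simplex (they sum to 1).
weights = [0.1, 0.2, 0.3, 0.4]
total_mass = sum(weights)
mean = bernstein_mean(weights)
```

No numerical integration is needed for either quantity, which is the kind of property that makes polynomial representations attractive when a belief has to be propagated exactly.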
Trust-Aware Embodied Bayesian Persuasion for Mixed-Autonomy
Peng, Shaoting, Driggs-Campbell, Katherine, Dong, Roy
Safe and efficient interaction between autonomous vehicles (AVs) and human-driven vehicles (HVs) is a critical challenge for future transportation systems. While game-theoretic models capture how AVs influence HVs, they often suffer from a long-term decay of influence and can be perceived as manipulative, eroding the human's trust. This can paradoxically lead to riskier human driving behavior over repeated interactions. In this paper, we address this challenge by proposing the Trust-Aware Embodied Bayesian Persuasion (TA-EBP) framework. Our work makes three key contributions: First, we apply Bayesian persuasion to model communication at traffic intersections, offering a transparent alternative to traditional game-theoretic models. Second, we introduce a trust parameter to the persuasion framework, deriving a theorem for the minimum trust level required for influence. Finally, we ground the abstract signals of Bayesian persuasion theory into a continuous, physically meaningful action space, deriving a second theorem for the optimal signal magnitude, realized as an AV's forward nudge. Additionally, we validate our framework in a mixed-autonomy traffic simulation, demonstrating that TA-EBP successfully persuades HVs to drive more cautiously, eliminating collisions and improving traffic flow compared to baselines that either ignore trust or lack communication. Our work provides a transparent and non-strategic framework for influence in human-robot interaction, enhancing both safety and efficiency.
Consistent causal discovery with equal error variances: a least-squares perspective
Chaudhuri, Anamitra, Ni, Yang, Bhattacharya, Anirban
We consider the problem of recovering the true causal structure among a set of variables generated by a linear acyclic structural equation model (SEM) whose error terms are independent and have equal variances. It is well known that the true underlying directed acyclic graph (DAG) encoding the causal structure is uniquely identifiable under this assumption. In this work, we establish that the sum of minimum expected squared errors over all variables, when each variable is predicted by the best linear combination of its parent variables, is minimised if and only if the causal structure is represented by a supergraph of the true DAG. This property is further utilised to design a Bayesian DAG selection method that recovers the true graph consistently.
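The least-squares property can be checked by hand in the smallest case. The sketch below assumes a hypothetical two-variable model X1 = e1, X2 = b*X1 + e2 with unit error variances and uses population covariances, so no data is simulated; the function name and graph labels are illustrative only.

```python
def squared_error_sum(b, graph):
    """Sum of minimum expected squared errors under a candidate graph.

    Model (assumed): X1 = e1, X2 = b * X1 + e2, Var(e1) = Var(e2) = 1.
    Each variable is predicted by the best linear function of its parents.
    """
    var1 = 1.0            # Var(X1)
    var2 = b * b + 1.0    # Var(X2)
    cov12 = b             # Cov(X1, X2)
    if graph == "empty":
        # No parents: the error is the marginal variance of each variable.
        return var1 + var2
    if graph == "X1->X2":
        # X2 regressed on X1 leaves residual variance var2 - cov12^2 / var1.
        return var1 + (var2 - cov12 ** 2 / var1)
    if graph == "X2->X1":
        # X1 regressed on X2 leaves residual variance var1 - cov12^2 / var2.
        return var2 + (var1 - cov12 ** 2 / var2)
    raise ValueError(f"unknown graph: {graph}")


b = 0.8
scores = {g: squared_error_sum(b, g) for g in ("empty", "X1->X2", "X2->X1")}
best_graph = min(scores, key=scores.get)
```

With b = 0.8, the true graph attains the sum 2 (the two error variances), while the reversed graph gives (b^2 + 1) + 1/(b^2 + 1), which exceeds 2 whenever b is nonzero.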
CausalPre: Scalable and Effective Data Pre-processing for Causal Fairness
Zheng, Ying, Jiang, Yangfan, Tan, Kian-Lee
Causal fairness in databases is crucial to preventing biased and inaccurate outcomes in downstream tasks. While most prior work assumes a known causal model, recent efforts relax this assumption by enforcing additional constraints. However, these approaches often fail to capture broader attribute relationships that are critical to maintaining utility. This raises a fundamental question: Can we harness the benefits of causal reasoning to design efficient and effective fairness solutions without relying on strong assumptions about the underlying causal model? In this paper, we seek to answer this question by introducing CausalPre, a scalable and effective causality-guided data pre-processing framework that guarantees justifiable fairness, a strong causal notion of fairness. CausalPre extracts causally fair relationships by reformulating the originally complex and computationally infeasible extraction task into a tailored distribution estimation problem. To ensure scalability, CausalPre adopts a carefully crafted variant of low-dimensional marginal factorization to approximate the joint distribution, complemented by a heuristic algorithm that efficiently tackles the associated computational challenge. Extensive experiments on benchmark datasets demonstrate that CausalPre is both effective and scalable, challenging the conventional belief that achieving causal fairness requires trading off relationship coverage for relaxed model assumptions.
Machine learning (ML) systems are increasingly integrated into decision-making processes in domains such as education [1], finance [2], employment [3], advertising [4], and law enforcement [5], [6]. While these systems offer efficiency and scalability, they also pose serious concerns about fairness [7]-[14]. In particular, their reliance on historical data can unintentionally amplify biases, producing inaccurate, discriminatory outcomes with severe real-world impacts in high-stakes areas like criminal justice.
These concerns have motivated the development of fairness-aware data pre-processing techniques within database management systems (DBMS) [15]-[22]. Compared to traditional fairness interventions at the model training or inference stages [23]-[28], pre-processing methods offer: (i) a once-for-all benefit, meaning that once data is calibrated for fairness, it can be used in any downstream task, regardless of the ML model employed; and (ii) a user-friendly workflow, as fairness considerations are directly embedded into the data pre-processing pipeline, enabling practitioners to focus on the downstream task without specialized fairness expertise. A straightforward approach to achieve this is to remove all sensitive attributes (e.g., gender and race) from the training data. However, such ad hoc solutions often fail in practice, as non-sensitive attributes may act as proxies for sensitive ones, particularly when strong correlations exist [18], [29].
Beyond the high score: Prosocial ability profiles of multi-agent populations
Tesic, Marko, Zhao, Yue, Leibo, Joel Z., Trivedi, Rakshit S., Hernandez-Orallo, Jose
The development and evaluation of social capabilities in AI agents require complex environments where competitive and cooperative behaviours naturally emerge. While game-theoretic properties can explain why certain teams or agent populations outperform others, more abstract behaviours, such as convention following, are harder to control in training and evaluation settings. The Melting Pot contest is a social AI evaluation suite designed to assess the cooperation capabilities of AI systems. In this paper, we apply a Bayesian approach known as Measurement Layouts to infer the capability profiles of multi-agent systems in the Melting Pot contest. We show that these capability profiles not only predict future performance within the Melting Pot suite but also reveal the underlying prosocial abilities of agents. Our analysis indicates that while higher prosocial capabilities sometimes correlate with better performance, this is not a universal trend: some lower-scoring agents exhibit stronger cooperation abilities. Furthermore, we find that top-performing contest submissions are more likely to achieve high scores in scenarios where prosocial capabilities are not required. These findings, together with reports that the contest winner used a hard-coded solution tailored to specific environments, suggest that at least one top-performing team may have optimised for conditions where cooperation was not necessary, potentially exploiting limitations in the evaluation framework. We provide recommendations for improving the annotation of cooperation demands and propose future research directions to account for biases introduced by different testing environments. Our results demonstrate that Measurement Layouts offer both strong predictive accuracy and actionable insights, contributing to a more transparent and generalisable approach to evaluating AI systems in complex social settings.