Goto

Collaborating Authors

 Bayesian Inference


Review for NeurIPS paper: Can I Trust My Fairness Metric? Assessing Fairness with Unlabeled Data and Bayesian Inference

Neural Information Processing Systems

Additional Feedback: This paper proposes to use Bayesian estimates of fairness metrics. It combines this with Bayesian calibration models (one for each protected attribute value in this particular case) in order to use unlabelled data. In light of existing work (Foulds et al 2019) on Bayesian modelling of fairness, the contribution is rather minor and is limited to the case where we have unlabelled data. The approach the authors use, as it is based on calibration, seems limited to rather specific notions of fairness where Bayesian calibration can be usefully applied. Although in l.64 the definition of calibration is correct, in l. 105-107 you write that s_j P_M(y_j 1 s_j) . Since j is a specific example, there should not be any randomness here.


Review for NeurIPS paper: Can I Trust My Fairness Metric? Assessing Fairness with Unlabeled Data and Bayesian Inference

Neural Information Processing Systems

This paper focuses on the problem of leveraging unlabelled data to generate better estimates of fairness metrics given limited labelled data. All three reviewers agree that the manuscript makes a valuable contribution and is conceptually and mathematically sound. The significance of the contribution (an auditor tool only, instead of an auditor plus a mitigation tool) is however at the low side.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

The paper describes tricks to scale Bayesian network structure learning to thousands of variables. This is achieved by developing new heuristics for candidate parent set identification and the subsequent order based structure optimization. In general, the paper is clearly written and easy to read. There are issues in editing and style, but the problems do not affect readability (much). The suggested heuristics feel bit ad-hoc, thus the value of the work is eventually judged by empirical evaluation.


Review for NeurIPS paper: Replica-Exchange Nos\'e-Hoover Dynamics for Bayesian Learning on Large Datasets

Neural Information Processing Systems

Summary and Contributions: The paper considers the problem of sampling from the posterior distribution in Bayesian inference. To be more precise, the paper approaches the question of stochastic sampling that relies only on minibatches of data at each iteration. To achieve rapid mixing between isolated modes, the authors consider parallel tempered chains and introduce replica-exchange steps into the stochastic Nose-Hoover Dynamics. The crux of this approach is the stochastic test for the replica-exchange step. To develop such a test, the authors follow the paper [An efficient minibatch acceptance test for metropolis-hastings], which introduces the concept of correction distribution.


Review for NeurIPS paper: Replica-Exchange Nos\'e-Hoover Dynamics for Bayesian Learning on Large Datasets

Neural Information Processing Systems

The paper proposes a novel MCMC-type algorithm to perform Bayesian inference on large datasets. The paper is a mixture of replica exchange, Nose-Hoover dynamics and non-standard acceptance criterion to deal with mini-batches. All the reviewers participated actively to the discussion after the rebuttal was made available. Although all the ingredients of the proposed method do exist, their combination is original and potentially useful for the ML literature as pointed out by most reviewers. Theorem 2 is also neat and proposes a nice way to propose swaps between replicas using mini-batches.


Student-t processes as infinite-width limits of posterior Bayesian neural networks

arXiv.org Machine Learning

The asymptotic properties of Bayesian Neural Networks (BNNs) have been extensively studied, particularly regarding their approximations by Gaussian processes in the infinite-width limit. We extend these results by showing that posterior BNNs can be approximated by Student-t processes, which offer greater flexibility in modeling uncertainty. Specifically, we show that, if the parameters of a BNN follow a Gaussian prior distribution, and the variance of both the last hidden layer and the Gaussian likelihood function follows an Inverse-Gamma prior distribution, then the resulting posterior BNN converges to a Student-t process in the infinite-width limit. Our proof leverages the Wasserstein metric to establish control over the convergence rate of the Student-t process approximation.


Joint State and Noise Covariance Estimation

arXiv.org Artificial Intelligence

This paper tackles the problem of jointly estimating the noise covariance matrix alongside primary parameters (such as poses and points) from measurements corrupted by Gaussian noise. In such settings, the noise covariance matrix determines the weights assigned to individual measurements in the least squares problem. We show that the joint problem exhibits a convex structure and provide a full characterization of the optimal noise covariance estimate (with analytical solutions) within joint maximum a posteriori and likelihood frameworks and several variants. Leveraging this theoretical result, we propose two novel algorithms that jointly estimate the primary parameters and the noise covariance matrix. To validate our approach, we conduct extensive experiments across diverse scenarios and offer practical insights into their application in robotics and computer vision estimation problems with a particular focus on SLAM.


LeAP: Consistent multi-domain 3D labeling using Foundation Models

arXiv.org Artificial Intelligence

Availability of datasets is a strong driver for research on 3D semantic understanding, and whilst obtaining unlabeled 3D point cloud data is straightforward, manually annotating this data with semantic labels is time-consuming and costly. Recently, Vision Foundation Models (VFMs) enable open-set semantic segmentation on camera images, potentially aiding automatic labeling. However,VFMs for 3D data have been limited to adaptations of 2D models, which can introduce inconsistencies to 3D labels. This work introduces Label Any Pointcloud (LeAP), leveraging 2D VFMs to automatically label 3D data with any set of classes in any kind of application whilst ensuring label consistency. Using a Bayesian update, point labels are combined into voxels to improve spatio-temporal consistency. A novel 3D Consistency Network (3D-CN) exploits 3D information to further improve label quality. Through various experiments, we show that our method can generate high-quality 3D semantic labels across diverse fields without any manual labeling. Further, models adapted to new domains using our labels show up to a 34.2 mIoU increase in semantic segmentation tasks.


Distribution learning via neural differential equations: minimal energy regularization and approximation theory

arXiv.org Machine Learning

Neural ordinary differential equations (ODEs) provide expressive representations of invertible transport maps that can be used to approximate complex probability distributions, e.g., for generative modeling, density estimation, and Bayesian inference. We show that for a large class of transport maps $T$, there exists a time-dependent ODE velocity field realizing a straight-line interpolation $(1-t)x + tT(x)$, $t \in [0,1]$, of the displacement induced by the map. Moreover, we show that such velocity fields are minimizers of a training objective containing a specific minimum-energy regularization. We then derive explicit upper bounds for the $C^k$ norm of the velocity field that are polynomial in the $C^k$ norm of the corresponding transport map $T$; in the case of triangular (Knothe--Rosenblatt) maps, we also show that these bounds are polynomial in the $C^k$ norms of the associated source and target densities. Combining these results with stability arguments for distribution approximation via ODEs, we show that Wasserstein or Kullback--Leibler approximation of the target distribution to any desired accuracy $\epsilon > 0$ can be achieved by a deep neural network representation of the velocity field whose size is bounded explicitly in terms of $\epsilon$, the dimension, and the smoothness of the source and target densities. The same neural network ansatz yields guarantees on the value of the regularized training objective.


Temporal Distribution Shift in Real-World Pharmaceutical Data: Implications for Uncertainty Quantification in QSAR Models

arXiv.org Artificial Intelligence

The estimation of uncertainties associated with predictions from quantitative structure-activity relationship (QSAR) models can accelerate the drug discovery process by identifying promising experiments and allowing an efficient allocation of resources. Several computational tools exist that estimate the predictive uncertainty in machine learning models. However, deviations from the i.i.d. setting have been shown to impair the performance of these uncertainty quantification methods. We use a real-world pharmaceutical dataset to address the pressing need for a comprehensive, large-scale evaluation of uncertainty estimation methods in the context of realistic distribution shifts over time. We investigate the performance of several uncertainty estimation methods, including ensemble-based and Bayesian approaches. Furthermore, we use this real-world setting to systematically assess the distribution shifts in label and descriptor space and their impact on the capability of the uncertainty estimation methods. Our study reveals significant shifts over time in both label and descriptor space and a clear connection between the magnitude of the shift and the nature of the assay. Moreover, we show that pronounced distribution shifts impair the performance of popular uncertainty estimation methods used in QSAR models. This work highlights the challenges of identifying uncertainty quantification methods that remain reliable under distribution shifts introduced by real-world data.