Bayesian Learning
Probabilistic Phase Labeling and Lattice Refinement for Autonomous Material Research
Chang, Ming-Chiang, Ament, Sebastian, Amsler, Maximilian, Sutherland, Duncan R., Zhou, Lan, Gregoire, John M., Gomes, Carla P., van Dover, R. Bruce, Thompson, Michael O.
X-ray diffraction (XRD) is an essential technique to determine a material's crystal structure in high-throughput experimentation, and has recently been incorporated in artificially intelligent agents in autonomous scientific discovery processes. However, rapid, automated and reliable analysis method of XRD data matching the incoming data rate remains a major challenge. To address these issues, we present CrystalShift, an efficient algorithm for probabilistic XRD phase labeling that employs symmetry-constrained pseudo-refinement optimization, best-first tree search, and Bayesian model comparison to estimate probabilities for phase combinations without requiring phase space information or training. We demonstrate that CrystalShift provides robust probability estimates, outperforming existing methods on synthetic and experimental datasets, and can be readily integrated into high-throughput experimental workflows. In addition to efficient phase-mapping, CrystalShift offers quantitative insights into materials' structural parameters, which facilitate both expert evaluation and AI-based modeling of the phase space, ultimately accelerating materials identification and discovery.
Brain-Inspired Computational Intelligence via Predictive Coding
Salvatori, Tommaso, Mali, Ankur, Buckley, Christopher L., Lukasiewicz, Thomas, Rao, Rajesh P. N., Friston, Karl, Ororbia, Alexander
Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large.
Adaptive Noise Covariance Estimation under Colored Noise using Dynamic Expectation Maximization
Meera, Ajith Anil, Lanillos, Pablo
A wide variety of NCM estimation methods have been proposed within the control community [1]. These methods Identifying the noise associated with a process, i.e., estimating can be classified into two categories: i) feedback free the Noise Covariance Matrix (NCM) is crucial for methods where the estimation is done by processing the state estimation and control of a dynamic system [1]. An entire data sequence offline and ii) feedback methods where incorrect NCM results in suboptimal gains (e.g., Kalman estimation is done online and. The feedback free methods are gain), significantly decreasing the quality of state estimation of two types: i) the correlation methods that are based on the and tracking. Hence, accurate NCM estimation has a wide analysis of the measurement error sequence, such as Indirect scope of applications that include robotics, signal processing, Correlation (ICM) [9], Input-Output Correlation (IOCM) fault detection, optimal controller design, system identification, [10], Weighted Correlation (WCM) [11], Measurement Average etc. However, most of the NCM estimation algorithms Correlation (MACM) [12], Direct Correlation (DCM) assume a white noise condition, which may not be true in [13] and Measurement Difference Correlation (MDCM) [14], practice. In many real-world applications the noise is colored and ii) the Maximum-Likelihood Methods (MLM) [15] that (e.g., there are temporal autocorrelations). This makes NCM maximises the likelihood function over the data.
Bayesian Hyperbolic Multidimensional Scaling
Liu, Bolun, Lubold, Shane, Raftery, Adrian E., McCormick, Tyler H.
Multidimensional scaling (MDS) is a widely used approach to representing high-dimensional, dependent data. MDS works by assigning each observation a location on a low-dimensional geometric manifold, with distance on the manifold representing similarity. We propose a Bayesian approach to multidimensional scaling when the low-dimensional manifold is hyperbolic. Using hyperbolic space facilitates representing tree-like structures common in many settings (e.g. text or genetic data with hierarchical structure). A Bayesian approach provides regularization that minimizes the impact of measurement error in the observed data and assesses uncertainty. We also propose a case-control likelihood approximation that allows for efficient sampling from the posterior distribution in larger data settings, reducing computational complexity from approximately $O(n^2)$ to $O(n)$. We evaluate the proposed method against state-of-the-art alternatives using simulations, canonical reference datasets, Indian village network data, and human gene expression data.
OCDaf: Ordered Causal Discovery with Autoregressive Flows
Kamkari, Hamidreza, Zehtab, Vahid, Balazadeh, Vahid, Krishnan, Rahul G.
We propose OCDaf, a novel order-based method for learning causal graphs from observational data. We establish the identifiability of causal graphs within multivariate heteroscedastic noise models, a generalization of additive noise models that allow for non-constant noise variances. Drawing upon the structural similarities between these models and affine autoregressive normalizing flows, we introduce a continuous search algorithm to find causal structures. Our experiments demonstrate state-of-the-art performance across the Sachs and SynTReN benchmarks in Structural Hamming Distance (SHD) and Structural Intervention Distance (SID). Furthermore, we validate our identifiability theory across various parametric and nonparametric synthetic datasets and showcase superior performance compared to existing baselines.
Spectral Ranking Inferences based on General Multiway Comparisons
Fan, Jianqing, Lou, Zhipeng, Wang, Weichen, Yu, Mengxin
This paper studies the performance of the spectral method in the estimation and uncertainty quantification of the unobserved preference scores of compared entities in a very general and more realistic setup in which the comparison graph consists of hyper-edges of possible heterogeneous sizes and the number of comparisons can be as low as one for a given hyper-edge. Such a setting is pervasive in real applications, circumventing the need to specify the graph randomness and the restrictive homogeneous sampling assumption imposed in the commonly-used Bradley-Terry-Luce (BTL) or Plackett-Luce (PL) models. Furthermore, in the scenarios when the BTL or PL models are appropriate, we unravel the relationship between the spectral estimator and the Maximum Likelihood Estimator (MLE). We discover that a two-step spectral method, where we apply the optimal weighting estimated from the equal weighting vanilla spectral method, can achieve the same asymptotic efficiency as the MLE. Given the asymptotic distributions of the estimated preference scores, we also introduce a comprehensive framework to carry out both one-sample and two-sample ranking inferences, applicable to both fixed and random graph settings. It is noteworthy that it is the first time effective two-sample rank testing methods are proposed. Finally, we substantiate our findings via comprehensive numerical simulations and subsequently apply our developed methodologies to perform statistical inferences on statistics journals and movie rankings.
Value-Distributional Model-Based Reinforcement Learning
Luis, Carlos E., Bottero, Alessandro G., Vinogradska, Julia, Berkenkamp, Felix, Peters, Jan
Quantifying uncertainty about a policy's long-term performance is important to solve sequential decision-making tasks. We study the problem from a model-based Bayesian reinforcement learning perspective, where the goal is to learn the posterior distribution over value functions induced by parameter (epistemic) uncertainty of the Markov decision process. Previous work restricts the analysis to a few moments of the distribution over values or imposes a particular distribution shape, e.g., Gaussians. Inspired by distributional reinforcement learning, we introduce a Bellman operator whose fixed-point is the value distribution function. Based on our theory, we propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function that can be used for policy optimization. Evaluation across several continuous-control tasks shows performance benefits with respect to established model-based and model-free algorithms.
Joint Task and Data Oriented Semantic Communications: A Deep Separate Source-channel Coding Scheme
Huang, Jianhao, Li, Dongxu, Huang, Chuan, Qin, Xiaoqi, Zhang, Wei
Semantic communications are expected to accomplish various semantic tasks with relatively less spectrum resource by exploiting the semantic feature of source data. To simultaneously serve both the data transmission and semantic tasks, joint data compression and semantic analysis has become pivotal issue in semantic communications. This paper proposes a deep separate source-channel coding (DSSCC) framework for the joint task and data oriented semantic communications (JTD-SC) and utilizes the variational autoencoder approach to solve the rate-distortion problem with semantic distortion. First, by analyzing the Bayesian model of the DSSCC framework, we derive a novel rate-distortion optimization problem via the Bayesian inference approach for general data distributions and semantic tasks. Next, for a typical application of joint image transmission and classification, we combine the variational autoencoder approach with a forward adaption scheme to effectively extract image features and adaptively learn the density information of the obtained features. Finally, an iterative training algorithm is proposed to tackle the overfitting issue of deep learning models. Simulation results reveal that the proposed scheme achieves better coding gain as well as data recovery and classification performance in most scenarios, compared to the classical compression schemes and the emerging deep joint source-channel schemes.
Shadow Datasets, New challenging datasets for Causal Representation Learning
Zhu, Jiageng, Xie, Hanchen, Wu, Jianhua, Li, Jiazhi, Khayatkhoei, Mahyar, Hussein, Mohamed E., AbdAlmageed, Wael
Discovering causal relations among semantic factors is an emergent topic in representation learning. Most causal representation learning (CRL) methods are fully supervised, which is impractical due to costly labeling. To resolve this restriction, weakly supervised CRL methods were introduced. To evaluate CRL performance, four existing datasets, Pendulum, Flow, CelebA(BEARD) and CelebA(SMILE), are utilized. However, existing CRL datasets are limited to simple graphs with few generative factors. Thus we propose two new datasets with a larger number of diverse generative factors and more sophisticated causal graphs. In addition, current real datasets, CelebA(BEARD) and CelebA(SMILE), the originally proposed causal graphs are not aligned with the dataset distributions. Thus, we propose modifications to them.
Scalable method for Bayesian experimental design without integrating over posterior distribution
Hoang, Vinh, Espath, Luis, Krumscheid, Sebastian, Tempone, Raúl
We address the computational efficiency in solving the A-optimal Bayesian design of experiments problems for which the observational map is based on partial differential equations and, consequently, is computationally expensive to evaluate. A-optimality is a widely used and easy-to-interpret criterion for Bayesian experimental design. This criterion seeks the optimal experimental design by minimizing the expected conditional variance, which is also known as the expected posterior variance. This study presents a novel likelihood-free approach to the A-optimal experimental design that does not require sampling or integrating the Bayesian posterior distribution. The expected conditional variance is obtained via the variance of the conditional expectation using the law of total variance, and we take advantage of the orthogonal projection property to approximate the conditional expectation. We derive an asymptotic error estimation for the proposed estimator of the expected conditional variance and show that the intractability of the posterior distribution does not affect the performance of our approach. We use an artificial neural network (ANN) to approximate the nonlinear conditional expectation in the implementation of our method. We then extend our approach for dealing with the case that the domain of experimental design parameters is continuous by integrating the training process of the ANN into minimizing the expected conditional variance. Through numerical experiments, we demonstrate that our method greatly reduces the number of observation model evaluations compared with widely used importance sampling-based approaches. This reduction is crucial, considering the high computational cost of the observational models. Code is available at https://github.com/vinh-tr-hoang/DOEviaPACE.