Bayesian Learning
Training diffusion models with reinforcement learning
Diffusion models have recently emerged as the de facto standard for generating complex, high-dimensional outputs. You may know them for their ability to produce stunning AI art and hyper-realistic synthetic images, but they have also found success in other applications such as drug design and continuous control. The key idea behind diffusion models is to iteratively transform random noise into a sample, such as an image or protein structure. This is typically motivated as a maximum likelihood estimation problem, where the model is trained to generate samples that match the training data as closely as possible. However, most use cases of diffusion models are not directly concerned with matching the training data, but instead with a downstream objective.
A Real-time Faint Space Debris Detector With Learning-based LCM
Lu, Zherui, Wang, Gangyi, Wei, Xinguo, Li, Jian
With the development of aerospace technology, the increasing population of space debris has posed a great threat to the safety of spacecraft. However, the low intensity of reflected light and the high angular velocity of space debris impede its extraction. Besides, due to the limitations of ground observation methods, small space debris can hardly be detected, making it necessary to enhance the spacecraft's capacity for space situational awareness (SSA). Considering that traditional methods have some defects in low-SNR target detection, such as low effectiveness and large time consumption, this paper proposes a method for low-SNR streak extraction based on local contrast and maximum likelihood estimation (MLE), which can efficiently detect space objects with an SNR of 2.0. In the proposed algorithm, local contrast is applied for crude classification, which returns connected components as preliminary results, and then MLE is performed to reconstruct the connected components of targets via orientated growth, further improving the precision. The algorithm has been verified with both simulated streaks and real star tracker images, and its average centroid error is close to that of state-of-the-art methods such as ODCC. At the same time, the algorithm has significant advantages in efficiency compared with ODCC. In conclusion, the algorithm offers high speed and precision, which makes it promising for the extraction of high dynamic targets.
A Bayesian Approach to Robust Inverse Reinforcement Learning
Wei, Ran, Zeng, Siliang, Li, Chenliang, Garcia, Alfredo, McDonald, Anthony, Hong, Mingyi
Inverse reinforcement learning (IRL) is the problem of extracting the reward function and policy of a value-maximizing agent from its behavior [1, 2]. IRL is an important tool in domains where manually specifying reward functions or policies is difficult, such as in autonomous driving [3], or when the extracted reward function can reveal novel insights about a target population and be used to devise interventions, such as in biology, economics, and human-robot interaction studies [4, 5, 6]. However, wider applications of IRL face two interrelated algorithmic challenges: 1) having access to the target deployment environment or an accurate simulator thereof and 2) robustness of the learned policy and reward function due to the covariate shift between the training and deployment environments or datasets [7, 8, 9]. In this paper, we focus on model-based offline IRL to address challenge 1). A notable class of model-based offline IRL methods estimate the dynamics and reward in a two-stage fashion (see Figure 1) [10, 11, 12, 13]. (Figure 1: Objectives of the traditional two-stage IRL and the proposed simultaneous estimation approach of Bayesian model-based IRL.) In the first stage, a dynamics model is estimated from the offline ...
Estimation of Counterfactual Interventions under Uncertainties
Weilbach, Juliane, Gerwinn, Sebastian, Kandemir, Melih, Fraenzle, Martin
Counterfactual analysis is intuitively performed by humans on a daily basis, e.g., "What should I have done differently to get the loan approved?". Such counterfactual questions also steer the formulation of scientific hypotheses. More formally, counterfactual analysis provides insights about potential improvements of a system by inferring the effects of hypothetical interventions into a past observation of the system's behaviour, which plays a prominent role in a variety of industrial applications. Due to the hypothetical nature of such analysis, counterfactual distributions are inherently ambiguous. This ambiguity is particularly challenging in continuous settings in which a continuum of explanations exists for the same observation. In this paper, we address this problem by following a hierarchical Bayesian approach which explicitly models such uncertainty. In particular, we derive counterfactual distributions for a Bayesian Warped Gaussian Process, thereby allowing for non-Gaussian distributions and non-additive noise. We illustrate the properties of our approach on a synthetic and a semi-synthetic example and show its performance when used within an algorithmic recourse downstream task.
Predictive change point detection for heterogeneous data
Glock, Anna-Christina, Sobieczky, Florian, Fürnkranz, Johannes, Filzmoser, Peter, Jech, Martin
A change point detection (CPD) framework assisted by a predictive machine learning model, called "Predict and Compare", is introduced and characterised in relation to other state-of-the-art online CPD routines, which it outperforms in terms of false positive rate and out-of-control average run length. The method's focus is on improving standard methods from sequential analysis, such as the CUSUM rule, in terms of these quality measures. This is achieved by replacing typically used trend estimation functionals such as the running mean with more sophisticated predictive models (Predict step), and comparing their prognosis with actual data (Compare step). The two models used in the Predict step are the ARIMA model and the LSTM recurrent neural network. However, the framework is formulated in general terms, so as to allow the use of prediction or comparison methods other than those tested here. The power of the method is demonstrated in a tribological case study in which change points separating the run-in, steady-state, and divergent wear phases are detected in the regime of very few false positives.
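The Predict and Compare steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `drift` and `threshold` parameters, and the two-sided CUSUM form are assumptions chosen for illustration, and a running mean stands in for the ARIMA or LSTM predictor so that the example needs only the standard library.

```python
from typing import Callable, Optional, Sequence


def running_mean_predictor(history: Sequence[float]) -> float:
    """Hypothetical stand-in predictor: mean of the last 5 observations.

    In the framework described above, this is the slot where a more
    sophisticated model (e.g. ARIMA or an LSTM) would be plugged in.
    """
    window = history[-5:]
    return sum(window) / len(window)


def predict_and_compare_cusum(
    series: Sequence[float],
    predict: Callable[[Sequence[float]], float],
    warmup: int = 5,
    drift: float = 0.5,
    threshold: float = 5.0,
) -> Optional[int]:
    """Return the index of the first alarm, or None if no alarm is raised.

    Predict step: forecast series[t] from the observations before t.
    Compare step: feed the forecast residual into a two-sided CUSUM rule.
    """
    pos, neg = 0.0, 0.0  # cumulative sums for upward and downward shifts
    for t in range(warmup, len(series)):
        residual = series[t] - predict(series[:t])
        pos = max(0.0, pos + residual - drift)
        neg = max(0.0, neg - residual - drift)
        if pos > threshold or neg > threshold:
            return t
    return None


if __name__ == "__main__":
    # Synthetic series with a level shift at index 30 (assumed test data).
    data = [0.0] * 30 + [3.0] * 30
    print(predict_and_compare_cusum(data, running_mean_predictor))
```

Because the predictor is passed in as a function, swapping the running mean for another forecasting model leaves the Compare step unchanged, which mirrors the general formulation mentioned in the abstract.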
A Bayesian approach to breaking things: efficiently predicting and repairing failure modes via sampling
From power grids to transportation and logistics systems, autonomous systems play a central, and often safety-critical, role in modern life. Even as these systems grow more complex and ubiquitous, we have already observed failures in autonomous systems like autonomous vehicles and power networks resulting in the loss of human life [1]. Given this context, it is important that we be able to verify the safety of autonomous systems prior to deployment; for instance, by understanding the different ways in which a system might fail and proposing repair strategies. Human designers often use their knowledge of likely failure modes to guide the design process; indeed, systematically assessing the risks of different failures and developing repair strategies is an important part of the systems engineering process [2]. However, as autonomous systems grow more complex, it becomes increasingly difficult for human engineers to manually predict likely failures. In this paper, we propose an automated framework for predicting, and then repairing, failure modes in complex autonomous systems. Our effort builds on a large body of work on testing and verification of autonomous systems, many of which focus on identifying failure modes or adversarial examples [3, 4, 5, 6, 7, 8], but we identify two major gaps in the state of the art. First, many existing methods [4, 5, 9, 7] use techniques like gradient descent to search locally for failure modes; however, in practice we are more interested in characterizing the distribution of potential failures, which requires a global perspective. Some methods exist that address this issue by taking a probabilistic approach to sample from an (unknown) distribution of failure modes [6, 10].
Heuristic Satisficing Inferential Decision Making in Human and Robot Active Perception
Chen, Yucheng, Zhu, Pingping, Alers, Anthony, Egner, Tobias, Sommer, Marc A., Ferrari, Silvia
Inferential decision-making algorithms typically assume that an underlying probabilistic model of decision alternatives and outcomes may be learned a priori or online. Furthermore, when applied to robots in real-world settings they often perform unsatisfactorily or fail to accomplish the necessary tasks because this assumption is violated and/or they experience unanticipated external pressures and constraints. Cognitive studies presented in this and other papers show that humans cope with complex and unknown settings by modulating between near-optimal and satisficing solutions, including heuristics, by leveraging information value of available environmental cues that are possibly redundant. Using the benchmark inferential decision problem known as "treasure hunt", this paper develops a general approach for investigating and modeling active perception solutions under pressure. By simulating treasure hunt problems in virtual worlds, our approach learns generalizable strategies from high performers that, when applied to robots, allow them to modulate between optimal and heuristic solutions on the basis of external pressures and probabilistic models, if and when available. The result is a suite of active perception algorithms for camera-equipped robots that outperform treasure-hunt solutions obtained via cell decomposition, information roadmap, and information potential algorithms, in both high-fidelity numerical simulations and physical experiments. The effectiveness of the new active perception strategies is demonstrated under a broad range of unanticipated conditions that cause existing algorithms to fail to complete the search for treasures, such as unmodelled time constraints, resource constraints, and adverse weather (fog).
Improved Auto-Encoding using Deterministic Projected Belief Networks
In this paper, we exploit the unique properties of a deterministic projected belief network (D-PBN) to take full advantage of trainable compound activation functions (TCAs). A D-PBN is a type of auto-encoder that operates by "backing up" through a feed-forward neural network. TCAs are activation functions with complex monotonic-increasing shapes that change the distribution of the data so that the linear transformation that follows is more effective. Because a D-PBN operates by "backing up", the TCAs are inverted in the reconstruction process, restoring the original distribution of the data, thus taking advantage of a given TCA in both analysis and reconstruction. In this paper, we show that a D-PBN auto-encoder with TCAs can significantly out-perform standard auto-encoders including variational auto-encoders.
Exploiting Noise as a Resource for Computation and Learning in Spiking Neural Networks
Ma, Gehua, Yan, Rui, Tang, Huajin
Formal version available at: https://cell.com/patterns/fulltext/S2666-3899(23)00200-3 Networks of spiking neurons underpin the extraordinary information-processing capabilities of the brain and have become pillar models in neuromorphic artificial intelligence. Despite extensive research on spiking neural networks (SNNs), most studies are established on deterministic models, overlooking the inherent non-deterministic, noisy nature of neural computations. This study introduces the noisy spiking neural network (NSNN) and the noise-driven learning rule (NDL) by incorporating noisy neuronal dynamics to exploit the computational advantages of noisy neural processing. NSNN provides a theoretical framework that yields scalable, flexible, and reliable computation. We demonstrate that NSNN leads to spiking neural models with competitive performance, improved robustness against challenging perturbations compared with deterministic SNNs, and better reproduction of probabilistic computations in neural coding. This study offers a powerful and easy-to-use tool for machine learning and neuromorphic intelligence practitioners and for computational neuroscience researchers.
Learning nonparametric DAGs with incremental information via high-order HSIC
Score-based methods for learning Bayesian networks (BN) aim to maximize global score functions. However, if local variables have direct and indirect dependence simultaneously, the global optimization of score functions misses edges between variables with an indirect dependence relationship, whose scores are smaller than those of variables with a direct dependence relationship. In this paper, we present an identifiability condition based on a determined subset of parents to identify the underlying DAG. Using this identifiability condition, we develop a two-phase algorithm, the optimal-tuning (OT) algorithm, to locally amend the global optimization. In the optimal phase, an optimization problem based on the first-order Hilbert-Schmidt independence criterion (HSIC) gives an estimated skeleton as the initial determined parents subset. In the tuning phase, the skeleton is locally tuned by deletion, addition and DAG-formalization strategies using the theoretically proved incremental properties of high-order HSIC. Numerical experiments on different synthetic and real-world datasets show that the OT algorithm outperforms existing methods. In particular, in the Sigmoid Mix model with graph size $d=40$, the structure intervention distance (SID) of the OT algorithm is 329.7 smaller than that obtained by CAM, which indicates that the graph estimated by the OT algorithm misses fewer edges than CAM. Source code of the OT algorithm is available at https://github.com/YafeiannWang/optimal-tune-algorithm.
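For readers unfamiliar with HSIC, the sketch below computes one common biased empirical estimate for two scalar variables, trace(KHLH)/(n-1)^2 with Gaussian kernels. It illustrates only the basic pairwise criterion, not the first-order or high-order variants or the OT algorithm from the paper; the function names, the fixed `bandwidth`, and the synthetic data are assumptions made for this example.

```python
import math
import random
from typing import List, Sequence


def gaussian_gram(x: Sequence[float], bandwidth: float = 1.0) -> List[List[float]]:
    """Gram matrix of a Gaussian (RBF) kernel over scalar samples."""
    scale = 2.0 * bandwidth ** 2
    return [[math.exp(-((a - b) ** 2) / scale) for b in x] for a in x]


def center(gram: List[List[float]]) -> List[List[float]]:
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    # The matrix is symmetric, so column means equal row means.
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(x: Sequence[float], y: Sequence[float], bandwidth: float = 1.0) -> float:
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    k_centered = center(gaussian_gram(x, bandwidth))
    l_gram = gaussian_gram(y, bandwidth)
    # trace(K H L H) = sum_ij (H K H)_ij * L_ij because L is symmetric.
    total = sum(k_centered[i][j] * l_gram[i][j] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(200)]
    y_dependent = [v * v for v in x]                        # nonlinear function of x
    y_independent = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print("dependent:  ", hsic(x, y_dependent))
    print("independent:", hsic(x, y_independent))
```

The dependent pair has zero linear correlation in expectation, yet its HSIC value should come out clearly larger than that of the independent pair, which is why kernel-based criteria of this kind are attractive for nonparametric structure learning.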