
### A Comprehensive Introduction to Bayesian Deep Learning

"The key distinguishing property of a Bayesian approach is marginalization instead of optimization, where we represent solutions given by all settings of parameters weighted by their posterior probabilities, rather than bet everything on a single setting of parameters." The time is ripe to dig into marginalization vs optimization, and broaden our general understanding of the Bayesian approach.
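To make the quoted distinction concrete, here is a minimal sketch (a hypothetical toy 1-D regression with a conjugate Gaussian posterior, not taken from the article) contrasting a single point estimate of the parameter with a posterior-weighted average of predictions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: y = w * x + noise, with a Gaussian prior on w.
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.array([0.1, 0.9, 2.1, 2.9])
sigma2 = 0.1  # assumed observation-noise variance
tau2 = 1.0    # assumed prior variance on w

# Conjugate Gaussian posterior over w (closed form for this toy model).
precision = x_train @ x_train / sigma2 + 1.0 / tau2
w_mean = (x_train @ y_train / sigma2) / precision
w_std = np.sqrt(1.0 / precision)

# Optimization bets on one setting of w; marginalization averages
# predictions over all settings, weighted by the posterior.
w_samples = rng.normal(w_mean, w_std, size=1000)
x_new = 4.0
pred_point = w_mean * x_new              # single-setting prediction
pred_bayes = np.mean(w_samples * x_new)  # posterior-averaged prediction
pred_std = np.std(w_samples * x_new)     # epistemic uncertainty
```

In this conjugate case the two predictions happen to coincide in expectation; the practical difference is that marginalization also delivers the predictive uncertainty `pred_std`, which a single point estimate cannot provide.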

### A rigorous introduction for linear models

This note is meant to provide an introduction to linear models and the theory behind them. Our goal is to give a rigorous introduction for readers with prior exposure to ordinary least squares. In machine learning, the output is usually a nonlinear function of the input; deep learning even aims to find a nonlinear dependence through many layers, which requires a large amount of computation. However, most of these algorithms build upon simple linear models. We therefore describe linear models from different views and explore the properties and theory behind them. The linear model is the main technique in regression problems, and the primary tool for it is the least squares approximation, which minimizes a sum of squared errors. This is a natural choice when we are interested in finding the regression function that minimizes the corresponding expected squared error. We first describe ordinary least squares from three different points of view, after which we perturb the model with random noise, in particular Gaussian noise. Under Gaussian noise the model admits a likelihood, which lets us introduce the maximum likelihood estimator. This Gaussian disturbance also yields a distribution theory for least squares, which helps us answer various questions and introduces related applications. We then prove that least squares is the best linear unbiased estimator in the sense of mean squared error and, most importantly, that it actually approaches the theoretical limit. We end with linear models under the Bayesian approach and beyond.
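As a small sketch of the starting point described above (using simulated data and assumed noise levels, not the note's own examples), ordinary least squares via the normal equations coincides with the maximum likelihood estimator when the noise is Gaussian:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate y = X @ beta + Gaussian noise; under this noise model the
# OLS estimator beta_hat = (X^T X)^{-1} X^T y is also the MLE.
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Normal equations (np.linalg.lstsq is preferred in practice for
# numerical stability; solve() is used here to show the formula).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With enough data and small noise, `beta_hat` lands close to `beta_true`, illustrating the unbiasedness the note goes on to prove.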

### A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control

In this paper, we present a state-of-the-art reinforcement learning method for autonomous driving. Our approach employs temporal difference learning in a Bayesian framework to learn vehicle control signals from sensor data. The agent has access to images from a forward-facing camera, which are preprocessed to generate semantic segmentation maps. We trained our system using both ground truth and estimated semantic segmentation input. Based on our observations from a large set of experiments, we conclude that training the system on ground truth input data leads to better performance than training the system on estimated input, even if estimated input is used for evaluation. The system is trained and evaluated in a realistic simulated urban environment using the CARLA simulator. The simulator also contains a benchmark that allows for comparing to other systems and methods. The required training time of the system is shown to be lower, and the performance on the benchmark superior, to competing approaches.

### Dealing with Overconfidence in Neural Networks: Bayesian Approach

I trained a multi-class classifier on images of cats, dogs and wild animals, then passed it an image of myself: it is 98% confident I'm a dog. This post is an exploration of a possible Bayesian fix. The problem isn't that I passed an inappropriate image, because models in the real world are passed all sorts of garbage; it's that the model is overconfident about an image far away from the training data.
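The failure mode can be reproduced with a tiny hypothetical example (toy weights and inputs, not the post's actual model): a softmax classifier's logits grow with distance from the decision boundary, so confidence approaches 100% exactly where the model has seen no data. Averaging predictions over perturbed weight samples, as a crude stand-in for a Bayesian posterior, tempers that confidence:

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

# Hypothetical trained weights of a 2-class linear classifier.
W = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

x_near = np.array([0.5, 0.4])    # near the training data / boundary
x_far = np.array([50.0, -50.0])  # far from anything the model has seen

conf_near = softmax(W @ x_near).max()  # modest confidence
conf_far = softmax(W @ x_far).max()    # near-certain, despite no data there

# Bayesian-flavoured fix: average softmax outputs over weight samples
# (here, crude Gaussian perturbations standing in for a posterior).
rng = np.random.default_rng(0)
probs = np.mean(
    [softmax((W + rng.normal(scale=1.0, size=W.shape)) @ x_far)
     for _ in range(500)],
    axis=0,
)
conf_far_bayes = probs.max()  # strictly less certain than the point model
```

The point model is essentially 100% confident on the far-away input, while the averaged model's confidence drops, which is the qualitative behaviour the Bayesian treatment aims for.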


### Interpreting Deep Learning Models for Epileptic Seizure Detection on EEG signals

While Deep Learning (DL) is often considered the state of the art for Artificial Intelligence-based medical decision support, it remains sparsely implemented in clinical practice and poorly trusted by clinicians due to the insufficient interpretability of neural network models. We have tackled this issue by developing interpretable DL models for the online detection of epileptic seizures from EEG signals. This has conditioned the preparation of the input signals, the network architecture, and the post-processing of the output, in line with domain knowledge. Specifically, we focus the discussion on three main aspects: 1) how to aggregate the classification results on signal segments provided by the DL model into a larger, seizure-level time scale; 2) the relevant frequency patterns learned in the first convolutional layer of different models, and their relation to the delta, theta, alpha, beta and gamma frequency bands on which the visual interpretation of EEG is based; and 3) the identification of the signal waveforms with the largest contribution towards the ictal class, according to the activation differences highlighted using the DeepLIFT method. Results show that the kernel size in the first layer determines the interpretability of the extracted features and the sensitivity of the trained models, even though the final performance is very similar after post-processing. We also found that amplitude is the main feature leading to an ictal prediction, suggesting that a larger patient population would be required to learn more complex frequency patterns. Still, our methodology successfully generalized across patient inter-variability for the majority of the studied population, with a classification F1-score of 0.873 and detection of 90% of the seizures.

### Kalman Filtering: An Intuitive Guide Based on Bayesian Approach

This year celebrates the 50th anniversary of the paper by Rudolf E. Kálmán that conferred upon the world the remarkable idea of the Kalman filter. In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, to produce estimates of unknown variables that tend to be more accurate than those based on a single measurement alone. This is achieved by estimating a joint probability distribution over the variables for each timeframe. The Kalman filter is ideally suited to understanding the behaviour of systems that change or evolve over time, and it is useful in situations where we might have uncertain information.
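The predict-update cycle described above can be sketched in one dimension. This is a minimal illustration under assumed noise parameters (`R`, `Q` and the random-walk state model are choices for this sketch, not from the guide): track a constant value from noisy measurements, blending prediction and measurement by the Kalman gain.

```python
import numpy as np

def kalman_1d(measurements, R=1.0, Q=1e-4, x0=0.0, P0=100.0):
    """1-D Kalman filter for x_t = x_{t-1}, z_t = x_t + N(0, R) noise."""
    x, P = x0, P0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: state unchanged; uncertainty grows by process noise Q.
        P = P + Q
        # Update: Kalman gain weighs measurement against prediction.
        K = P / (P + R)
        x = x + K * (z - x)
        P = (1 - K) * P
        estimates.append(x)
    return estimates

rng = np.random.default_rng(1)
true_value = 5.0
zs = true_value + rng.normal(scale=1.0, size=50)  # noisy measurements
est = kalman_1d(zs)
```

Each individual measurement is off by about 1.0 on average, yet the filtered estimate converges much closer to the true value, which is exactly the "more accurate than a single measurement" property the paragraph describes.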

### Towards Scalable Bayesian Learning of Causal DAGs

We give methods for Bayesian inference of directed acyclic graphs, DAGs, and the induced causal effects from passively observed complete data. Our methods build on a recent Markov chain Monte Carlo scheme for learning Bayesian networks, which enables efficient approximate sampling from the graph posterior, provided that each node is assigned a small number K of candidate parents. We present algorithmic tricks to significantly reduce the space and time requirements of the method, making it feasible to use substantially larger values of K. Furthermore, we investigate the problem of selecting the candidate parents per node so as to maximize the covered posterior mass. Finally, we combine our sampling method with a novel Bayesian approach for estimating causal effects in linear Gaussian DAG models. Numerical experiments demonstrate the performance of our methods in detecting ancestor-descendant relations, and in effect estimation our Bayesian method is shown to outperform existing approaches.

### Why you should try the Bayesian approach of A/B testing

"Critical thinking is an active and ongoing process. It requires that we all think like Bayesians, updating our knowledge as new information comes in." ― Daniel J. Levitin, A Field Guide to Lies: Critical Thinking in the Information Age

Before we delve into the intuition behind the Bayesian approach to estimation, we need to understand a few concepts. Inferential statistics is when you infer something about a whole population based on a sample of that population, as opposed to descriptive statistics, which describes something about the whole population. When it comes to inferential statistics, there are two main philosophies: frequentist inference and Bayesian inference. The frequentist approach is the more traditional approach to statistical inference, and is thus studied more in most statistics courses (especially introductory courses). However, many would argue that the Bayesian approach is much closer to the way humans naturally perceive probability.
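A common form of the Bayesian A/B test uses a Beta-Binomial model. The sketch below uses made-up conversion counts and uniform Beta(1, 1) priors (both assumptions for illustration, not from the article): update each variant's posterior with its observed conversions, then estimate P(B > A) by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed data for two variants.
conversions_a, trials_a = 120, 1000
conversions_b, trials_b = 150, 1000

# Beta(1, 1) prior + Binomial likelihood gives a Beta posterior:
# Beta(1 + successes, 1 + failures) for each variant.
post_a = rng.beta(1 + conversions_a, 1 + trials_a - conversions_a, size=100_000)
post_b = rng.beta(1 + conversions_b, 1 + trials_b - conversions_b, size=100_000)

prob_b_better = np.mean(post_b > post_a)  # P(rate_B > rate_A | data)
expected_lift = np.mean(post_b - post_a)  # expected absolute lift
```

Rather than a binary reject/fail-to-reject decision, the output is a direct probability statement ("B beats A with probability p"), which is the intuitive reading the article argues for.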

### A Bayesian Approach with Type-2 Student-t Membership Function for T-S Model Identification

Clustering techniques have proved highly successful for Takagi-Sugeno (T-S) fuzzy model identification. In particular, fuzzy c-regression clustering based on type-2 fuzzy sets has shown remarkable results on non-sparse data, but its performance degrades on sparse data. In this paper, an innovative architecture for the fuzzy c-regression model is presented, and a novel Student-t distribution based membership function is designed for sparse data modelling. To avoid overfitting, we adopt a Bayesian approach, incorporating a Gaussian prior on the regression coefficients. An additional novelty of our approach lies in the type reduction, where the final output is computed using the Karnik-Mendel algorithm and the consequent parameters of the model are optimized using the stochastic gradient descent method. Detailed experimentation shows that the proposed approach outperforms various state-of-the-art methods on standard datasets.