### [P] Gaussian Process Regression tutorial • r/MachineLearning

Hi, I am making a tutorial/Jupyter notebook on Gaussian Process Regression, and I just finished part 1. My intention is to help people understand the algorithm without getting too deep into the mathematics / Bayesian statistics. It gives an introduction, lists some pros and cons of GP regression, then presents the mathematics of Gaussian processes and shows how to sample from a Gaussian process in NumPy.
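For readers who want a preview of the sampling step the tutorial covers, here is a minimal sketch of drawing sample paths from a zero-mean GP prior with an RBF kernel in NumPy. The kernel hyperparameters and grid are illustrative choices, not taken from the tutorial itself:

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    # Squared-exponential (RBF) covariance between two sets of 1-D inputs.
    sqdist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / length_scale**2)

rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 100)          # evaluation grid
K = rbf_kernel(x, x)                 # prior covariance matrix
# Small jitter keeps the Cholesky factorization numerically stable.
L = np.linalg.cholesky(K + 1e-10 * np.eye(len(x)))
# Each column of the standard-normal draw gives one sample path:
# f = L @ z is distributed as N(0, K).
samples = L @ rng.standard_normal((len(x), 3))
print(samples.shape)  # (100, 3)
```

The Cholesky trick (transforming i.i.d. standard normals by a matrix square root of the covariance) is the standard way to sample a multivariate Gaussian, and hence a GP evaluated at finitely many points.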

### Explaining Missing Heritability Using Gaussian Process Regression

For many traits and common human diseases, causal loci uncovered by genetic association studies account for little of the known heritable variation. We propose a Bayesian non-parametric Gaussian Process Regression model for identifying associated loci in the presence of interactions of arbitrary order. We analysed 46 quantitative yeast phenotypes and found that over 70% of the total known missing heritability could be explained using common genetic variants, many without significant marginal effects. Additional analysis of an immunological rat phenotype identified a three-SNP interaction model providing a significantly better fit (p-value 9.0e-11) than the null model incorporating only the single marginally significant SNP. This new approach, called GPMM, represents a significant advance in approaches to understanding the missing heritability problem, with potentially important implications for studies of complex, quantitative traits.

### Stochastic Variational Inference for Fully Bayesian Sparse Gaussian Process Regression Models

This paper presents a novel variational inference framework for deriving a family of Bayesian sparse Gaussian process regression (SGPR) models whose approximations are variationally optimal with respect to the full-rank GPR model enriched with various corresponding correlation structures of the observation noises. Our variational Bayesian SGPR (VBSGPR) models jointly treat the distributions of both the inducing variables and the hyperparameters as variational parameters, which enables the decomposability of the variational lower bound; this decomposability can in turn be exploited for stochastic optimization. The stochastic optimization iteratively follows the stochastic gradient of the variational lower bound to improve its estimates of the optimal variational distributions of the inducing variables and hyperparameters (and hence the predictive distribution) of our VBSGPR models, and it is guaranteed to converge asymptotically to them. We show that the stochastic gradient is an unbiased estimator of the exact gradient and can be computed in constant time per iteration, hence achieving scalability to big data. We empirically evaluate the performance of our proposed framework on two real-world, massive datasets.
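To make the inducing-variable idea behind sparse GPR concrete: instead of conditioning on all N training points at O(N^3) cost, one conditions on M << N inducing inputs. The sketch below is not the paper's VBSGPR; it is a minimal standard inducing-point predictive mean on hypothetical toy data (noisy sine), showing where the M x M matrix replaces the N x N one:

```python
import numpy as np

def rbf(a, b, ls=1.0, var=1.0):
    # Squared-exponential covariance for 1-D inputs.
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

rng = np.random.default_rng(1)
# Hypothetical toy data: noisy sine with N = 200 observations.
X = rng.uniform(-3, 3, 200)
y = np.sin(X) + 0.1 * rng.standard_normal(200)
Z = np.linspace(-3, 3, 10)          # M = 10 inducing inputs (assumed placement)
noise = 0.1 ** 2                    # observation-noise variance

Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))   # M x M, with jitter
Kuf = rbf(Z, X)                            # M x N
# Sigma = Kuu + sigma^{-2} Kuf Kuf^T is the only matrix ever factorized:
# its size is M x M, so the cost is O(N M^2) rather than O(N^3).
Sigma = Kuu + Kuf @ Kuf.T / noise

Xs = np.linspace(-3, 3, 50)         # test inputs
Ksu = rbf(Xs, Z)                    # test-to-inducing covariances
# Predictive mean of the sparse approximation.
mean = Ksu @ np.linalg.solve(Sigma, Kuf @ y) / noise
```

The paper's contribution goes further: it treats the inducing-variable and hyperparameter distributions variationally and optimizes the resulting decomposable lower bound with stochastic (minibatch) gradients, which this deterministic sketch does not attempt.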