Duvenaud, David, Lloyd, James Robert, Grosse, Roger, Tenenbaum, Joshua B., Ghahramani, Zoubin

Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art. We define a space of kernel structures which are built compositionally by adding and multiplying a small number of base kernels. We present a method for searching over this space of structures which mirrors the scientific discovery process. The learned structures can often decompose functions into interpretable components and enable long-range extrapolation on time-series datasets. Our structure search method outperforms many widely used kernels and kernel combination methods on a variety of prediction tasks.

Schaechtle, Ulrich, Saad, Feras, Radul, Alexey, Mansinghka, Vikash

There is a widespread need for techniques that can discover structure from time series data. Recently introduced techniques such as Automatic Bayesian Covariance Discovery (ABCD) provide a way to find structure within a single time series by searching through a space of covariance kernels that is generated using a simple grammar. While ABCD can identify a broad class of temporal patterns, it is difficult to extend and can be brittle in practice. This paper shows how to extend ABCD by formulating it in terms of probabilistic program synthesis. The key technical ideas are to (i) represent models using abstract syntax trees for a domain-specific probabilistic language, and (ii) represent the time series model prior, likelihood, and search strategy using probabilistic programs in a sufficiently expressive language. The final probabilistic program is written in under 70 lines of probabilistic code in Venture. The paper demonstrates an application to time series clustering that involves a non-parametric extension to ABCD, experiments for interpolation and extrapolation on real-world econometric data, and improvements in accuracy over both non-parametric and standard regression baselines.
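The model prior described above, a grammar over kernel ASTs, can be sketched as a recursive sampler. The paper's actual prior is a Venture probabilistic program; this is a Python stand-in, and the stopping probability, operators, and base-kernel set are illustrative assumptions.

```python
import random

BASE_KERNELS = ["SE", "LIN", "PER", "WN"]

def sample_kernel_ast(rng, p_stop=0.6):
    """Sample an AST: either a base-kernel leaf or (op, left, right)."""
    if rng.random() < p_stop:
        return rng.choice(BASE_KERNELS)
    op = rng.choice(["+", "*"])
    return (op, sample_kernel_ast(rng, p_stop), sample_kernel_ast(rng, p_stop))

def render(ast):
    """Pretty-print an AST back into kernel-expression syntax."""
    if isinstance(ast, str):
        return ast
    op, left, right = ast
    return f"({render(left)} {op} {render(right)})"

rng = random.Random(42)
for _ in range(3):
    print(render(sample_kernel_ast(rng)))
```

Representing models as ASTs like this is what makes the approach easy to extend: adding a base kernel or an operator is a one-line change to the grammar, and the same tree is consumed by the likelihood and the search strategy.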

Automating statistical modelling is a challenging problem in artificial intelligence. The Automatic Statistician takes a first step in this direction by employing a kernel search algorithm with Gaussian Processes (GPs) to provide interpretable statistical models for regression problems. However, this approach does not scale due to the $O(N^3)$ running time of model selection. We propose Scalable Kernel Composition (SKC), a scalable kernel search algorithm that extends the Automatic Statistician to bigger data sets. In doing so, we derive a cheap upper bound on the GP marginal likelihood that, together with the variational lower bound, sandwiches the marginal likelihood. We show that the upper bound is significantly tighter than the lower bound and thus useful for model selection.
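The point of sandwiching the marginal likelihood is that model selection can proceed on intervals: a candidate kernel can be discarded as soon as its upper bound falls below some other candidate's lower bound. A minimal sketch of that pruning rule follows; the bound values are placeholder numbers, not SKC's actual variational or upper bounds.

```python
def prune(candidates):
    """candidates: dict name -> (lower, upper) bounds on log p(y | model).
    Keep only models whose upper bound reaches the best lower bound."""
    best_lower = max(lo for lo, _ in candidates.values())
    return {name: (lo, up) for name, (lo, up) in candidates.items()
            if up >= best_lower}

# Placeholder intervals for three candidate kernel structures.
bounds = {"SE":       (-120.0, -110.0),
          "SE + PER": ( -95.0,  -90.0),
          "LIN":      (-140.0, -130.0)}
survivors = prune(bounds)
print(sorted(survivors))
```

Here only "SE + PER" survives: the other candidates' upper bounds sit below its lower bound, so they can be rejected without ever computing the exact $O(N^3)$ marginal likelihood. The tighter the upper bound, the more aggressively this prunes.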

Wong, Raymond K. (University of New South Wales) | Chu, Victor (University of New South Wales) | Ghanavati, Mojgan (University of New South Wales) | Hamzehei, Asso (University of New South Wales)

Causal structure discovery methods have been investigated recently, but none of them has taken possible time-varying structure into consideration. This paper uses the notion of a causal time-varying dynamic Bayesian network (CTV-DBN) and defines a causal boundary to govern cross-time information sharing. CTV-DBN is constructed using asymmetric kernels to address sample scarcity and to adhere to causal principles, while maintaining a good variance and bias trade-off. Once the causal Markov assumption is satisfied, causal inference can be made based on the manipulation rule. We explore trajectory data collected from taxis in Beijing, which exhibit heterogeneous patterns, data sparseness and distribution skewness. Experiments show that by using causal structures and trajectory clustering, we can analyse the spatio-temporal behavior of the trajectory data.
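The asymmetric-kernel idea can be sketched as a one-sided weighting: when estimating the network at time $t$, only samples at times $s \le t$ receive weight, so no information flows backward across the causal boundary. The exponential decay and bandwidth below are illustrative assumptions, not the paper's exact kernel.

```python
import numpy as np

def asymmetric_weights(t, times, bandwidth=2.0):
    """One-sided kernel: exponential decay over the past, zero on the future."""
    w = np.exp(-(t - times) / bandwidth)
    w[times > t] = 0.0        # causal boundary: future samples contribute nothing
    s = w.sum()
    return w / s if s > 0 else w

times = np.arange(10.0)
w = asymmetric_weights(5.0, times)
print(np.round(w, 3))
```

Compared with a symmetric kernel centred at $t$, this trades some variance (fewer effective samples) for adherence to causal principles, which is the bias/variance trade-off the abstract refers to.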

Probabilistic programming languages represent complex data with intermingled models in a few lines of code. Efficient inference algorithms in probabilistic programming languages make it possible to build unified frameworks for computing interesting probabilities in various large, real-world problems. When the structure of the model is given, constructing a probabilistic program is rather straightforward. Thus, the main focus has been on learning the best model parameters and computing marginal probabilities. In this paper, we provide a new perspective on building expressive probabilistic programs from continuous time series data when the structure of the model is not given. The intuition behind our method is to find a descriptive covariance structure of the time series data using nonparametric Gaussian process regression. We report that such a descriptive covariance structure can be used to efficiently and accurately derive a probabilistic program description.
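The step from a discovered covariance structure to program text can be sketched as code generation: a kernel AST is compiled into source for a covariance function, the core of the generated probabilistic program. This is a minimal sketch, not the paper's system; the AST encoding and the emitted expressions are illustrative assumptions.

```python
import numpy as np

def emit(ast):
    """Translate a kernel AST into a Python expression over arrays X, Z."""
    if ast == "SE":
        return "np.exp(-0.5 * (X[:, None] - Z[None, :])**2)"
    if ast == "LIN":
        return "np.outer(X, Z)"
    op, left, right = ast
    return f"({emit(left)} {op} {emit(right)})"

def compile_kernel(ast):
    """Return both the generated source text and a callable built from it."""
    src = f"lambda X, Z: {emit(ast)}"
    return src, eval(src, {"np": np})

src, k = compile_kernel(("+", "SE", "LIN"))
print(src)                    # the derived program text
X = np.array([0.0, 1.0])
print(k(X, X))                # the covariance it computes
```

Emitting source text rather than only a callable is the point: the generated expression is itself a readable description of the model, which a probabilistic program can then wrap with a likelihood and priors.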