Uncertainty
Efficient Thompson Sampling for Online ๏ฟผMatrix-Factorization Recommendation
Jaya Kawale, Hung H. Bui, Branislav Kveton, Long Tran-Thanh, Sanjay Chawla
Matrix factorization (MF) collaborative filtering is an effective and widely used method in recommendation systems. However, the problem of finding an optimal trade-off between exploration and exploitation (otherwise known as the bandit problem), a crucial problem in collaborative filtering from cold-start, has not been previously addressed. In this paper, we present a novel algorithm for online MF recommendation that automatically combines finding the most relevant items with exploring new or less-recommended items. Our approach, called Particle Thompson sampling for MF (PTS), is based on the general Thompson sampling framework, but augmented with a novel efficient online Bayesian probabilistic matrix factorization method based on the Rao-Blackwellized particle filter. Extensive experiments in collaborative filtering using several real-world datasets demonstrate that PTS significantly outperforms the current state-of-the-arts.
Learning structured densities via infinite dimensional exponential families
Siqi Sun, Mladen Kolar, Jinbo Xu
Learning the structure of a probabilistic graphical models is a well studied problem in the machine learning community due to its importance in many applications. Current approaches are mainly focused on learning the structure under restrictive parametric assumptions, which limits the applicability of these methods. In this paper, we study the problem of estimating the structure of a probabilistic graphical model without assuming a particular parametric model. We consider probabilities that are members of an infinite dimensional exponential family [4], which is parametrized by a reproducing kernel Hilbert space (RKHS) H and its kernel k . One difficulty in learning nonparametric densities is the evaluation of the normalizing constant. In order to avoid this issue, our procedure minimizes the penalized score matching objective [10, 11]. We show how to efficiently minimize the proposed objective using existing group lasso solvers. Furthermore, we prove that our procedure recovers the graph structure with high-probability under mild conditions. Simulation studies illustrate ability of our procedure to recover the true graph structure without the knowledge of the data generating process.
Sample Efficient Path Integral Control under Uncertainty
Yunpeng Pan, Evangelos Theodorou, Michail Kontitsis
We present a data-driven optimal control framework that is derived using the path integral (PI) control approach. We find iterative control laws analytically without a priori policy parameterization based on probabilistic representation of the learned dynamics model. The proposed algorithm operates in a forward-backward manner which differentiate it from other PI-related methods that perform forward sampling to find optimal controls. Our method uses significantly less samples to find analytic control laws compared to other approaches within the PI control family that rely on extensive sampling from given dynamics models or trials on physical systems in a model-free fashion. In addition, the learned controllers can be generalized to new tasks without re-sampling based on the compositionality theory for the linearly-solvable optimal control framework. We provide experimental results on three different tasks and comparisons with state-of-the-art model-based methods to demonstrate the efficiency and generalizability of the proposed framework.