forward method
Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
We study sampling as optimization in the space of measures. We focus on gradient flow-based optimization with the Langevin dynamics as a case study. We investigate the source of the bias of the unadjusted Langevin algorithm (ULA) in discrete time, and consider how to remove or reduce the bias. We point out that the difficulty is that the heat flow is exactly solvable, but neither its forward nor its backward method is implementable in general, except for Gaussian data. We propose the symmetrized Langevin algorithm (SLA), which should have a smaller bias than ULA, at the price of implementing a proximal gradient step in space. We show that SLA is in fact consistent for a Gaussian target measure, whereas ULA is not. We also illustrate various algorithms explicitly for a Gaussian target measure, including gradient descent, proximal gradient, and Forward-Backward, and show they are all consistent.
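To make the ULA bias concrete, here is a minimal sketch that is not taken from the paper. It assumes a one-dimensional Gaussian target N(0, sigma2) and the standard ULA update x_{k+1} = x_k - step * x_k / sigma2 + sqrt(2 * step) * z with z ~ N(0, 1). Under that assumption the variance of the iterates follows a deterministic recursion, so no random sampling is needed. The function names are hypothetical, and the step size must satisfy 0 < step < 2 * sigma2 for the recursion to converge.

```python
def ula_stationary_variance(sigma2, step, n_iter=10_000, v0=1.0):
    """Iterate the variance recursion of ULA on the target N(0, sigma2).

    If x_k ~ N(0, v), then one ULA step gives
    x_{k+1} ~ N(0, (1 - step / sigma2)**2 * v + 2 * step).
    """
    a = 1.0 - step / sigma2  # contraction factor of the gradient step
    v = v0
    for _ in range(n_iter):
        v = a * a * v + 2.0 * step
    return v


def ula_closed_form(sigma2, step):
    """Fixed point of the recursion above: sigma2 / (1 - step / (2 * sigma2))."""
    return sigma2 / (1.0 - step / (2.0 * sigma2))


if __name__ == "__main__":
    sigma2 = 1.0
    for step in (0.5, 0.1, 0.01):
        v = ula_stationary_variance(sigma2, step)
        print(f"step={step}: ULA variance={v:.6f}, target variance={sigma2}")
```

For any positive step size the limiting variance is strictly larger than sigma2, and it approaches sigma2 only as the step size goes to zero, which is the sense in which ULA is biased for this target.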
Yet another introduction to Neural Networks
In this notebook, I will explain how to implement a neural network from scratch and test it on the version of the MNIST dataset that ships with Scikit-Learn. I will specifically illustrate the use of Python classes to define the layers of the network as objects. Each layer object has forward and backward propagation methods, which leads to compact, easily readable code. In writing this tutorial, I drew inspiration from Peter Roelants' page. After loading the data, we divide it into three parts: training, validation, and test sets.
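The layer-as-object idea can be sketched as follows. This is not the notebook's code: the class names `Linear` and `Sigmoid` and the helpers `forward_all` and `backward_all` are hypothetical, and the sketch uses plain Python lists on a single input vector to stay dependency-free, whereas a real implementation would more likely use NumPy arrays and mini-batches.

```python
import math
import random


class Linear:
    """Fully connected layer y = W x + b for a single input vector."""

    def __init__(self, n_in, n_out, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.W, self.b)]

    def backward(self, grad_out):
        # Gradients of the loss with respect to the parameters.
        self.dW = [[g * xi for xi in self.x] for g in grad_out]
        self.db = list(grad_out)
        # Gradient with respect to the input, handed to the previous layer.
        return [sum(self.W[j][i] * grad_out[j] for j in range(len(grad_out)))
                for i in range(len(self.x))]


class Sigmoid:
    """Elementwise logistic activation."""

    def forward(self, x):
        self.y = [1.0 / (1.0 + math.exp(-v)) for v in x]
        return self.y

    def backward(self, grad_out):
        # d(sigmoid)/dx = y * (1 - y), applied elementwise.
        return [g * y * (1.0 - y) for g, y in zip(grad_out, self.y)]


def forward_all(layers, x):
    """Run the input through every layer in order."""
    for layer in layers:
        x = layer.forward(x)
    return x


def backward_all(layers, grad):
    """Propagate the output gradient back through the layers in reverse."""
    for layer in reversed(layers):
        grad = layer.backward(grad)
    return grad
```

Because every layer exposes the same two methods, a network is just a list of layer objects, and adding a new layer type means writing one more class with `forward` and `backward`.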
Data Piques: Matrix Factorization in PyTorch
Hey, remember when I wrote those ungodly long posts about matrix factorization chock-full of gory math? You can forget it all. We have now entered the Era of Deep Learning, and automatic differentiation shall be our guiding light. Less facetiously, I have finally spent some time checking out these new-fangled deep learning frameworks, and damn if I am not excited. In this post, I will show you how to use PyTorch to bypass the mess of code from my old post on Explicit Matrix Factorization and instead implement a model that will converge faster in fewer lines of code.