Collaborating Authors

 nakkiran


Composition and Control with Distilled Energy Diffusion Models and Sequential Monte Carlo

arXiv.org Machine Learning

Diffusion models may be formulated as a time-indexed sequence of energy-based models, where the score corresponds to the negative gradient of an energy function. As opposed to learning the score directly, an energy parameterization is attractive as the energy itself can be used to control generation via Monte Carlo samplers. Architectural constraints and training instability in energy-parameterized models have so far yielded inferior performance compared to directly approximating the score or denoiser. We address these deficiencies by introducing a novel training regime for the energy function through distillation of pre-trained diffusion models, resembling a Helmholtz decomposition of the score vector field. We further showcase the synergies between energy and score by casting the diffusion sampling procedure as a Feynman-Kac model where sampling is controlled using potentials from the learnt energy functions. The Feynman-Kac model formalism enables composition and low-temperature sampling through sequential Monte Carlo.
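The relation between score and energy stated in the abstract can be illustrated with a toy case. The sketch below is not the paper's method: it assumes an isotropic Gaussian, for which the energy is known in closed form, and the names `energy`, `score`, and `numerical_grad` are hypothetical helpers chosen for illustration. It only checks numerically that the score equals the negative gradient of the energy.

```python
import numpy as np

def energy(x, sigma=1.0):
    # Energy of an isotropic Gaussian, up to an additive constant (assumed toy case).
    return 0.5 * np.sum(x ** 2) / sigma ** 2

def score(x, sigma=1.0):
    # Closed-form score of the same Gaussian: grad log p(x) = -x / sigma^2.
    return -x / sigma ** 2

def numerical_grad(f, x, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        g[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return g

x = np.array([0.5, -1.0, 2.0])
# If score = -grad(energy), then score + grad(energy) should be close to zero.
max_gap = np.max(np.abs(score(x) + numerical_grad(energy, x)))
print(max_gap)
```

In an energy-parameterized diffusion model the energy would be a learned network and the gradient would come from automatic differentiation; the finite-difference check here only stands in for that step.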


Sometimes more data can hurt!

#artificialintelligence

In a recent blog post I discussed a scalable sparse linear regression model I developed at work. One of its interesting properties is that it's an interpolating model, meaning it has zero training error. This is because it's over-parameterized and can therefore fit the training data perfectly. While zero training error is usually associated with overfitting, the model seems to perform pretty well on the test set. Reports of hugely over-parameterized models that do not seem to suffer from overfitting (especially in deep learning) have been accumulating in recent years, and so has the literature on the subject.
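To make the idea of an interpolating model concrete, here is a minimal sketch. It is not the author's sparse model: it assumes plain minimum-norm least squares on synthetic data, with more features than samples, and the sizes and variable names are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: 20 samples, 100 features, so the model is over-parameterized.
n_samples, n_features = 20, 100
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)  # pure-noise targets, the hardest case to "explain"

# For an underdetermined system, lstsq returns the minimum-norm solution.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Training error is zero up to floating-point rounding: the model interpolates.
train_mse = np.mean((X @ w - y) ** 2)
print(train_mse)
```

Because there are more parameters than training points, the fit passes through every training target even though the targets are random noise. Whether such a model also generalizes depends on the data, which this sketch does not test.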