Review for NeurIPS paper: Relative gradient optimization of the Jacobian term in unsupervised deep learning

Feb-5-2025, 12:14:47 GMT–Neural Information Processing Systems

Summary and Contributions: Quite a bit of recent research on deep density estimation under the normalizing flows umbrella has focused on efficiently computing (a restricted form of) the Jacobian term that appears in the objective. Such models restrict themselves to a set of transformations for which this term is easy to compute. While arbitrary distributions can be learned by such methods, the learned features are quite skewed, which can prevent learning a properly disentangled representation. This paper presents a conceptually simple method to optimize the exact maximum likelihood in such models. In particular, the authors consider a transform from the observed to the latent space that is parameterized by fully connected networks, with the only constraint being that the weight matrices are invertible. Since the parameters of the transformation are matrices, the authors use properties of the Riemannian geometry of matrix spaces to derive updates in terms of the relative gradient.
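To make the relative-gradient idea concrete, here is a minimal sketch for a single invertible linear layer z = W x. It is an illustration, not the authors' implementation: the standard-normal latent prior, the single-layer setup, and all function names are assumptions chosen for brevity. The point it shows is that multiplying the ordinary gradient G on the right by WᵀW turns the log-determinant contribution W⁻ᵀ into W itself, so the update needs no matrix inversion.

```python
import numpy as np


def log_likelihood(W, X):
    """Average log-likelihood of rows of X under z = W x.

    Assumes a standard-normal prior on z (an illustrative choice).
    """
    Z = X @ W.T
    d = W.shape[0]
    log_prior = -0.5 * np.sum(Z**2, axis=1) - 0.5 * d * np.log(2 * np.pi)
    _, logdet = np.linalg.slogdet(W)  # log|det W|, the Jacobian term
    return float(np.mean(log_prior) + logdet)


def euclidean_gradient(W, X):
    """Ordinary gradient of the log-likelihood; needs an explicit inverse."""
    Z = X @ W.T
    score = -Z  # d log p(z) / dz for the standard-normal prior
    return score.T @ X / X.shape[0] + np.linalg.inv(W).T


def relative_gradient(W, X):
    """Relative gradient G W^T W, computed without inverting W.

    The Jacobian part W^{-T} W^T W collapses to W, and the data part
    becomes (mean of score z^T) W.
    """
    Z = X @ W.T
    score = -Z
    return (score.T @ Z / X.shape[0] + np.eye(W.shape[0])) @ W


def relative_gradient_step(W, X, lr=1e-3):
    """One gradient-ascent step on the log-likelihood."""
    return W + lr * relative_gradient(W, X)
```

In this sketch the inversion-free form and the explicit `G @ W.T @ W` product agree numerically, which is the property that makes the exact Jacobian term cheap to optimize for unconstrained invertible weight matrices.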


