 latent distribution


Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks

Neural Information Processing Systems

Deep neural networks suffer from over-fitting and catastrophic forgetting when trained with small data. One natural remedy for this problem is data augmentation, which has been recently shown to be effective. However, previous works either assume that intra-class variances can always be generalized to new classes, or employ naive generation methods to hallucinate finite examples without modeling their latent distributions. In this work, we propose Covariance-Preserving Adversarial Augmentation Networks to overcome existing limits of low-shot learning. Specifically, a novel Generative Adversarial Network is designed to model the latent distribution of each novel class given its related base counterparts. Since direct estimation on novel classes can be inductively biased, we explicitly preserve covariance information as the "variability" of base examples during the generation process. Empirical results show that our model can generate realistic yet diverse examples, leading to substantial improvements on the ImageNet benchmark over the state of the art.
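The idea of carrying base-class variability over to a novel class can be illustrated with a much simpler stand-in than the adversarial network the abstract describes. The sketch below is not the paper's method: it assumes a plain Gaussian centred on the mean of the few novel examples, with the covariance borrowed from a related base class. The function name `augment_with_base_covariance` and the toy feature arrays are hypothetical.

```python
import numpy as np

def augment_with_base_covariance(base_feats, novel_feats, n_new, rng):
    """Sample n_new synthetic feature vectors for a novel class.

    Assumption: a Gaussian centred on the novel-class mean, using the
    covariance of a related base class, stands in for a learned generator.
    """
    base_cov = np.cov(base_feats, rowvar=False)  # (d, d) variability of the base class
    novel_mean = novel_feats.mean(axis=0)        # (d,) centre from the few novel shots
    return rng.multivariate_normal(novel_mean, base_cov, size=n_new)

rng = np.random.default_rng(0)
# Toy data: many base-class examples, only two novel-class examples.
base_feats = rng.normal(size=(500, 3)) * np.array([1.0, 2.0, 0.5])
novel_feats = np.array([[5.0, -1.0, 0.0], [5.2, -0.8, 0.1]])

synthetic = augment_with_base_covariance(base_feats, novel_feats, 5000, rng)
print(synthetic.shape)
```

The synthetic samples sit around the novel-class mean but spread like the base class, which is the property the abstract calls preserving covariance information. A learned generator would replace the Gaussian assumption.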



Learning Disentangled Joint Continuous and Discrete Representations

Emilien Dupont

Neural Information Processing Systems

It comes with the advantages of VAEs, such as stable training, large sample diversity and a principled inference network, while having the flexibility to model a combination of continuous and discrete generative factors.



Complexity Matters: Rethinking the Latent Space for Generative Modeling

Neural Information Processing Systems

Our investigation starts with the classic generative adversarial networks (GANs). Inspired by the GAN training objective, we propose a novel "distance" between the latent and data distributions, whose minimization





Mutual information and task-relevant latent dimensionality

Paarth Gulati, Eslam Abdelaleem, Audrey Sederberg, Ilya Nemenman

arXiv.org Machine Learning

Estimating the dimensionality of the latent representation needed for prediction (the task-relevant dimension) is a difficult, largely unsolved problem with broad scientific applications. We cast it as an Information Bottleneck question: what embedding bottleneck dimension is sufficient to compress predictor and predicted views while preserving their mutual information (MI)? We show that standard neural estimators with separable/bilinear critics systematically inflate the inferred dimension, and we address this by introducing a hybrid critic that retains an explicit dimensional bottleneck while allowing flexible nonlinear cross-view interactions, thereby preserving the latent geometry. We further propose a one-shot protocol that reads off the effective dimension from a single over-parameterized hybrid model, without sweeping over bottleneck sizes. We validate the approach on synthetic problems with known task-relevant dimension. We extend the approach to intrinsic dimensionality by constructing paired views of a single dataset, enabling comparison with classical geometric dimension estimators. In noisy regimes where those estimators degrade, our approach remains reliable. Finally, we demonstrate the utility of the method on multiple physics datasets.

Before "low-dimensional latent embeddings" became a rallying cry of AI, they were already a basic aim of science: identify a low-dimensional state (a small set of degrees of freedom constructed from observations) that suffices to predict the quantities of interest. The long road from Aristotelian to Newtonian mechanics illustrates that determining the number of such state variables, the relevant latent dimensionality, can be hard, even before one argues about the right variables or the laws that relate them.
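For readers unfamiliar with the separable (bilinear) critics mentioned in the abstract above, the following is a minimal sketch of one such critic and a contrastive MI lower bound computed from its scores. Several things here are assumptions rather than details from the source: the InfoNCE bound as the estimator, the fixed (untrained) linear embeddings `wx` and `wy`, and the toy one-dimensional Gaussian data. The hybrid critic and the one-shot protocol are not reproduced.

```python
import numpy as np

def separable_scores(x, y, wx, wy):
    """Scores f(x_i, y_j) = <x_i wx, y_j wy>; the bottleneck size is k = wx.shape[1]."""
    return (x @ wx) @ (y @ wy).T

def infonce_bound(scores):
    """InfoNCE lower bound on MI (in nats) from an (n, n) score matrix.

    Paired samples sit on the diagonal; the bound never exceeds log(n).
    """
    n = scores.shape[0]
    row_max = scores.max(axis=1, keepdims=True)  # stabilise the log-sum-exp
    log_norm = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    return float(np.mean(np.diag(scores) - log_norm) + np.log(n))

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 1))
y = x + 0.1 * rng.normal(size=(n, 1))  # strongly dependent view of x
w = np.ones((1, 1))                    # hypothetical fixed embedding, k = 1

dependent = infonce_bound(separable_scores(x, y, w, w))
independent = infonce_bound(separable_scores(x, rng.permutation(y), w, w))
print(dependent, independent)
```

With the pairing intact the bound is clearly larger than after shuffling `y`, which destroys the dependence. In a dimensionality study one would train the embeddings and vary `k`; here the weights are fixed only to keep the example short.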