Goto

Collaborating Authors

 Statistical Learning




Legendre Decomposition for Tensors

Neural Information Processing Systems

CP decomposition compresses an input tensor into a sum of rank-one components, and Tucker decomposition approximates an input tensor by a core tensor multiplied by matrices. To date, matrix and tensor decomposition has been extensively analyzed, and there are a number of variations of such decomposition (Kolda and Bader, 2009), where the common goal is to approximate a given tensor by a smaller number of components, or parameters,inanefficientmanner. However, despite the recent advances of decomposition techniques, a learning theory that can systematically define decomposition for any order tensors including vectors and matrices is still under development. Moreover, it is well known that CP and Tucker tensor decomposition include non-convex optimization and that the global convergence is not guaranteed.




f475bdd151d8b5fa01215aeda925e75c-Paper-Conference.pdf

Neural Information Processing Systems

Weconsider the pool-based activelearning problem, where only asubset ofthe training data is labeled, and the goal is to query a batch of unlabeled samples to be labeled so as to maximally improve model performance.


Robustness of conditional GANs to noisy labels

Neural Information Processing Systems

We study the problem of learning conditional generators from noisy labeled samples, where the labels are corrupted by random noise. A standard training of conditional GANs will not only produce samples with wrong labels, but also generate poor quality samples. We consider two scenarios, depending on whether the noise model is known or not. When the distribution of the noise is known, we introduce a novel architecture which we call Robust Conditional GAN (RCGAN). The main idea is to corrupt the label of the generated sample before feeding to the adversarial discriminator, forcing the generator to produce samples with clean labels. This approach of passing through a matching noisy channel is justified by accompanying multiplicative approximation bounds between the loss of the RCGAN and the distance between the clean real distribution and the generator distribution. This shows that the proposed approach is robust, when used with a carefully chosen discriminator architecture, known as projection discriminator. When the distribution of the noise is not known, we provide an extension of our architecture, which we call RCGAN-U, that learns the noise model simultaneously while training the generator. We show experimentally on MNIST and CIFAR-10 datasets that both the approaches consistently improve upon baseline approaches, and RCGAN-U closely matches the performance of RCGAN.




Posterior Collapse of a Linear Latent Variable Model

Neural Information Processing Systems

This work identifies the existence and cause of a type of posterior collapse that frequently occurs in the Bayesian deep learning practice. For a general linear latent variable model that includes linear variational autoencoders as a special case, we precisely identify the nature of posterior collapse to be the competition between the likelihood and the regularization of the mean due to the prior. Our result suggests that posterior collapse may be related to neural collapse and dimensional collapse and could be a subclass of a general problem of learning for deeper architectures.