Statistical Learning
Legendre Decomposition for Tensors
Mahito Sugiyama, Hiroyuki Nakahara, Koji Tsuda
CP decomposition compresses an input tensor into a sum of rank-one components, and Tucker decomposition approximates an input tensor by a core tensor multiplied by matrices. To date, matrix and tensor decomposition has been extensively analyzed, and there are a number of variations of such decomposition (Kolda and Bader, 2009), where the common goal is to approximate a given tensor by a smaller number of components, or parameters,inanefficientmanner. However, despite the recent advances of decomposition techniques, a learning theory that can systematically define decomposition for any order tensors including vectors and matrices is still under development. Moreover, it is well known that CP and Tucker tensor decomposition include non-convex optimization and that the global convergence is not guaranteed.
Robustness of conditional GANs to noisy labels
Kiran K. Thekumparampil, Ashish Khetan, Zinan Lin, Sewoong Oh
We study the problem of learning conditional generators from noisy labeled samples, where the labels are corrupted by random noise. A standard training of conditional GANs will not only produce samples with wrong labels, but also generate poor quality samples. We consider two scenarios, depending on whether the noise model is known or not. When the distribution of the noise is known, we introduce a novel architecture which we call Robust Conditional GAN (RCGAN). The main idea is to corrupt the label of the generated sample before feeding to the adversarial discriminator, forcing the generator to produce samples with clean labels. This approach of passing through a matching noisy channel is justified by accompanying multiplicative approximation bounds between the loss of the RCGAN and the distance between the clean real distribution and the generator distribution. This shows that the proposed approach is robust, when used with a carefully chosen discriminator architecture, known as projection discriminator. When the distribution of the noise is not known, we provide an extension of our architecture, which we call RCGAN-U, that learns the noise model simultaneously while training the generator. We show experimentally on MNIST and CIFAR-10 datasets that both the approaches consistently improve upon baseline approaches, and RCGAN-U closely matches the performance of RCGAN.
Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion
Gavin Zhang, University of Illinois at Urbana–Champaign, jialun2@illinois.edu, "3026 Hong-Ming Chiu, University of Illinois at Urbana–Champaign, hmchiu2@illinois.edu, "3026 Richard Y. Zhang, University of Illinois at Urbana–Champaign, ryz@illinois.edu
Posterior Collapse of a Linear Latent Variable Model
This work identifies the existence and cause of a type of posterior collapse that frequently occurs in the Bayesian deep learning practice. For a general linear latent variable model that includes linear variational autoencoders as a special case, we precisely identify the nature of posterior collapse to be the competition between the likelihood and the regularization of the mean due to the prior. Our result suggests that posterior collapse may be related to neural collapse and dimensional collapse and could be a subclass of a general problem of learning for deeper architectures.