Dai, Bo, Dai, Hanjun, He, Niao, Liu, Weiyang, Liu, Zhen, Chen, Jianshu, Xiao, Lin, Song, Le

Variational inference plays a vital role in learning graphical models, especially on large-scale datasets. Much of its success depends on a proper choice of auxiliary distribution class for posterior approximation. However, how to pursue an auxiliary distribution class that achieves both good approximation ability and computational efficiency remains a core challenge. In this paper, we propose coupled variational Bayes, which exploits the primal-dual view of the ELBO with the variational distribution class generated by an optimization procedure, a construction we term optimization embedding. Theoretically, we establish an interesting connection to gradient flow and demonstrate the extreme flexibility of this implicit distribution family in the limit.
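The core idea that a distribution class can be *generated by an optimization procedure* can be illustrated with a deliberately simplified sketch (not the paper's actual algorithm): base samples are pushed through unrolled noisy gradient-ascent steps on the log target, i.e. an unadjusted Langevin discretization of the gradient flow the abstract alludes to. The target, step size, and step count below are all illustrative choices.

```python
import numpy as np

def optimization_embedded_samples(grad_log_p, n=2000, steps=2000, eta=0.05, seed=0):
    # Illustrative sketch: particles from a simple base distribution are
    # transformed by unrolled noisy gradient steps on log p(z) (unadjusted
    # Langevin dynamics). The resulting particle cloud is a distribution
    # defined by the optimization procedure itself, not a fixed parametric family.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)  # base samples from N(0, 1)
    for _ in range(steps):
        z = z + eta * grad_log_p(z) + np.sqrt(2 * eta) * rng.standard_normal(n)
    return z

# Toy target: a posterior N(2, 1), so grad log p(z) = -(z - 2).
samples = optimization_embedded_samples(lambda z: -(z - 2.0))
```

In the limit of small steps and many particles, the cloud tracks the Wasserstein gradient flow of the KL divergence, which is the flavor of gradient-flow connection the abstract mentions.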

Sensoy, Murat, Kaplan, Lance, Kandemir, Melih

Deterministic neural nets have been shown to learn effective predictors on a wide range of machine learning problems. However, as the standard approach is to train the network to minimize a prediction loss, the resultant model remains ignorant of its prediction confidence. Orthogonal to Bayesian neural nets, which indirectly infer prediction uncertainty through weight uncertainties, we propose explicit modeling of this uncertainty using the theory of subjective logic. By placing a Dirichlet distribution on the class probabilities, we treat predictions of a neural net as subjective opinions and learn the function that collects the evidence leading to these opinions by a deterministic neural net from data. The resultant predictor for a multi-class classification problem is another Dirichlet distribution whose parameters are set by the continuous output of a neural net. We provide a preliminary analysis of how the peculiarities of our new loss function drive improved uncertainty estimation. We observe that our method achieves unprecedented success on the detection of out-of-distribution queries and endurance against adversarial perturbations.
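The subjective-logic bookkeeping described above is short enough to sketch directly. In this construction, non-negative per-class evidence (e.g., a ReLU applied to the network's outputs) yields Dirichlet parameters α = e + 1; the belief masses and the vacuity (uncertainty) mass then follow from the Dirichlet strength S = Σα. The function name below is our own, not from the paper.

```python
import numpy as np

def dirichlet_opinion(evidence):
    # evidence: non-negative per-class evidence vector, e.g. ReLU(net output)
    alpha = evidence + 1.0        # Dirichlet parameters alpha_k = e_k + 1
    S = alpha.sum()               # Dirichlet strength
    K = len(alpha)
    belief = evidence / S         # subjective-logic belief masses
    uncertainty = K / S           # vacuity: high when total evidence is low
    p_mean = alpha / S            # expected class probabilities
    return belief, uncertainty, p_mean
```

With zero evidence the opinion is maximally uncertain (uncertainty = 1, uniform expected probabilities); concentrating evidence on one class drives the uncertainty mass down, which is exactly the behavior exploited for out-of-distribution detection.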

Ojha, Utkarsh, Singh, Krishna Kumar, Hsieh, Cho-Jui, Lee, Yong Jae

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Imbalanced Data

We propose a novel unsupervised generative model, Elastic-InfoGAN, that learns to disentangle object identity from other low-level aspects in class-imbalanced datasets. We first investigate the issues surrounding the assumptions about uniformity made by InfoGAN (Chen et al. (2016)), and demonstrate its ineffectiveness in properly disentangling object identity in imbalanced data. Our key idea is to make the discovery of the discrete latent factor of variation invariant to identity-preserving transformations in real images, and to use that as the signal to learn the latent distribution's parameters. Experiments on both artificial (MNIST) and real-world (YouTube-Faces) datasets demonstrate the effectiveness of our approach in imbalanced data by: (i) better disentanglement of object identity as a latent factor of variation; and (ii) better approximation of class imbalance in the data, as reflected in the learned parameters of the latent distribution. Recent deep neural network based models such as Generative Adversarial Networks (Goodfellow et al. (2014); Salimans et al. (2016); Radford et al. (2016)) and Variational Autoencoders (Kingma & Welling (2014); Higgins et al. (2017)) have led to promising results in generating realistic samples for high-dimensional and complex data such as images. More advanced models show how to discover disentangled representations (Yan et al. (2016); Chen et al. (2016); Tran et al. (2017); Hu et al. (2018); Singh et al. (2019)), in which different latent dimensions can be made to represent independent factors of variation (e.g., pose, identity) in the data (e.g., human faces). InfoGAN (Chen et al. (2016)), in particular, tries to learn an unsupervised disentangled representation by maximizing the mutual information between the discrete or continuous latent variables and the corresponding generated samples. For discrete latent factors (e.g., digit identities), it assumes that they are uniformly distributed in the data, and approximates them accordingly using a fixed uniform categorical distribution. Although this assumption holds true for many existing benchmark datasets (e.g., MNIST (LeCun, 1998)), real-world data often follows a long-tailed distribution and rarely exhibits perfect balance between the categories.
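The mutual-information term InfoGAN maximizes admits a standard variational lower bound, E[log Q(c | G(z, c))] + H(c), whose trainable part is just a cross-entropy between the latent code c and the auxiliary network Q's prediction on the generated image. A minimal numpy sketch of that loss (function names are ours; in Elastic-InfoGAN the categorical prior's parameters would themselves be learned rather than fixed uniform):

```python
import numpy as np

def log_softmax(x):
    # Numerically stable row-wise log-softmax.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def infogan_mi_loss(q_logits, c_onehot):
    # Negative variational lower bound (up to the constant H(c)) on I(c; G(z, c)):
    # cross-entropy between the sampled latent code c and Q's prediction
    # on the corresponding generated sample.
    # q_logits: (N, K) outputs of the auxiliary network Q
    # c_onehot: (N, K) one-hot latent codes used to generate the samples
    return -(c_onehot * log_softmax(q_logits)).sum(axis=1).mean()
```

When Q recovers the code perfectly the loss approaches zero; when Q is uninformative it sits at log K, so minimizing it tightens the mutual-information bound.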

Liang, Guohua (University of Technology, Sydney)

As growing numbers of real-world applications involve imbalanced class distributions or unequal costs for misclassification errors in different classes, learning from imbalanced class distributions is considered one of the most challenging issues in data mining research. This study empirically investigates the sensitivity of bagging predictors with respect to 12 algorithms and 9 levels of class distribution on 14 imbalanced datasets, using statistical and graphical methods to address the important issue of understanding the effect of varying levels of class distribution on bagging predictors. The experimental results demonstrate that bagging NB and MLP are insensitive to various levels of imbalanced class distribution.
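For readers unfamiliar with the predictor being stress-tested here, bagging itself is compact: fit each base learner on a bootstrap resample and majority-vote their predictions. A minimal binary sketch with a nearest-centroid base learner (our choice for brevity; the study uses learners such as NB and MLP) also shows where imbalance bites, since a bootstrap sample can under-represent the minority class:

```python
import numpy as np

def bagged_predict(X_train, y_train, X_test, n_estimators=25, seed=0):
    # Minimal bagging sketch: each base learner is a nearest-centroid
    # classifier fit on a bootstrap resample; final labels are majority votes.
    rng = np.random.default_rng(seed)
    n = len(y_train)
    votes = np.zeros((len(X_test), 2), dtype=int)  # binary: classes {0, 1}
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)           # bootstrap resample
        Xb, yb = X_train[idx], y_train[idx]
        # Guard: a resample may miss the minority class entirely; fall back
        # to the full-data centroid for that class.
        cents = np.array([Xb[yb == k].mean(axis=0) if (yb == k).any()
                          else X_train[y_train == k].mean(axis=0)
                          for k in (0, 1)])
        pred = np.argmin(((X_test[:, None, :] - cents) ** 2).sum(-1), axis=1)
        votes[np.arange(len(X_test)), pred] += 1
    return votes.argmax(axis=1)
```

Varying the class ratio of `y_train` in a harness like this is, in miniature, the kind of sensitivity experiment the study runs at scale.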