Collaborating Authors

Characterizing Bias in Classifiers using Generative Models

Neural Information Processing Systems

Models learned from real-world data are often biased because the data used to train them is biased. This can propagate existing systemic human biases and ultimately lead to inequitable treatment of people, especially minorities. To characterize bias in learned classifiers, existing approaches rely on human oracles labeling real-world examples to identify the classifiers' "blind spots"; these approaches are ultimately limited by the human labor required and the finite supply of existing image examples. We propose a simulation-based approach for systematically interrogating classifiers using generative adversarial models. We incorporate a progressive conditional generative model for synthesizing photo-realistic facial images and Bayesian Optimization for efficient interrogation of independent facial image classification systems.
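
The interrogation loop described above can be sketched in a few lines. Below is a minimal illustration, assuming a pretrained conditional generator and a black-box classifier under test; the `generator` and `classifier` functions are toy stand-ins, and the latent dimension, search bounds, query budget, and the choice of scikit-optimize's `gp_minimize` for Bayesian Optimization are assumptions, not the paper's actual setup.

```python
# Minimal sketch: searching a generator's latent space for classifier
# "blind spots" with Bayesian Optimization. The generator and classifier
# below are toy stand-ins for the paper's progressive conditional GAN
# and the black-box face classifier under test.
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

LATENT_DIM = 8  # assumed; real generators use much larger latent spaces

def generator(z):
    # Stand-in: a real generator would map z to a photo-realistic face.
    return np.asarray(z)

def classifier(image):
    # Stand-in: returns the classifier's confidence in the correct label.
    return 1.0 / (1.0 + np.exp(-image.sum()))

def objective(z):
    # BO minimizes confidence, steering synthesis toward failure cases.
    return float(classifier(generator(z)))

result = gp_minimize(
    objective,
    dimensions=[Real(-3.0, 3.0) for _ in range(LATENT_DIM)],
    n_calls=50,        # interrogation budget (number of classifier queries)
    random_state=0,
)
print("lowest confidence found:", result.fun)
print("worst-case latent code:", result.x)
```

Because each query synthesizes a fresh image, the search is not limited to a finite pool of real examples, which is the point of the simulation-based approach.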


Fast Classification Rates for High-dimensional Gaussian Generative Models

Neural Information Processing Systems

We consider the problem of binary classification when the covariates, conditioned on each of the response values, follow multivariate Gaussian distributions. We focus on the setting where the covariance matrices for the two conditional distributions are the same. The corresponding generative model classifier, derived via the Bayes rule and also called Linear Discriminant Analysis, has been shown to behave poorly in high-dimensional settings. We present a novel analysis of the classification error of any linear discriminant approach given conditional Gaussian models. This allows us to compare the generative model classifier, other recently proposed discriminative approaches that directly learn the discriminant function, and finally logistic regression, another classical discriminative model classifier. As we show, under a natural sparsity assumption, and letting $s$ denote the sparsity of the Bayes classifier, $p$ the number of covariates, and $n$ the number of samples, the simple ($\ell_1$-regularized) logistic regression classifier achieves the fast misclassification error rate of $O\left(\frac{s \log p}{n}\right)$, which is much better than the other approaches, which are either inconsistent in high-dimensional settings or achieve the slower rate of $O\left(\sqrt{\frac{s \log p}{n}}\right)$.
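
To make the comparison concrete, here is a small simulation sketch of the setting above: two Gaussian classes with a shared covariance and a sparse mean difference, so the Bayes classifier is $s$-sparse. The dimensions, the identity covariance, the regularization strength, and the scikit-learn estimators are illustrative assumptions, not the paper's experimental setup.

```python
# Sketch: generative (LDA) vs. discriminative (l1-regularized logistic
# regression) classifiers on data from the conditional Gaussian model
# with shared covariance and a sparse Bayes discriminant.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p, s = 200, 500, 5                # n samples, p covariates, sparsity s
mu = np.zeros(p)
mu[:s] = 1.0                         # sparse mean shift -> sparse Bayes rule

def sample(n_samples):
    """Draw from the model: X | y ~ N(y * mu, I), shared identity covariance."""
    y = rng.integers(0, 2, n_samples)
    X = rng.standard_normal((n_samples, p)) + np.outer(y, mu)
    return X, y

X_train, y_train = sample(n)
X_test, y_test = sample(5 * n)

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)
logreg = LogisticRegression(penalty="l1", solver="liblinear").fit(X_train, y_train)

print("LDA accuracy:      ", lda.score(X_test, y_test))
print("l1-logreg accuracy:", logreg.score(X_test, y_test))
```

With $p$ well above $n$, the sample covariance estimate underlying plain LDA is singular and the generative classifier degrades, while the $\ell_1$ penalty lets logistic regression exploit the sparsity of the Bayes rule.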


Towards quantitative methods to assess network generative models

arXiv.org Machine Learning

Assessing generative models is not an easy task. Generative models should synthesize graphs that are not replicas of real networks but that show topological features similar to those of real graphs. We introduce an approach for assessing graph generative models using graph classifiers. The inability of an established graph classifier to distinguish real from synthesized graphs can be taken as a performance measure for graph generators.
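
A minimal version of this test is easy to sketch: train a classifier on cheap topological features and check whether it can separate the two sources. The graph families below (Barabási–Albert graphs standing in for the "real" networks, Erdős–Rényi graphs standing in for a generator's output) and the feature set are illustrative assumptions, not the paper's benchmark.

```python
# Sketch of the classifier-based assessment: if a graph classifier can
# tell "real" graphs from a generator's output, the generator fails to
# reproduce the topology. Graph families and features are stand-ins.
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def features(G):
    """A few cheap topological summaries of a graph."""
    degs = [d for _, d in G.degree()]
    return [np.mean(degs), np.std(degs),
            nx.average_clustering(G), nx.density(G)]

real  = [nx.barabasi_albert_graph(100, 3, seed=i) for i in range(50)]
synth = [nx.erdos_renyi_graph(100, 0.06, seed=i) for i in range(50)]

X = np.array([features(G) for G in real + synth])
y = np.array([1] * len(real) + [0] * len(synth))

# Accuracy near 0.5 => indistinguishable (a good generator);
# accuracy near 1.0 => the generator's topology is easy to spot.
acc = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean()
print("classifier accuracy:", acc)
```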


OpenIAS: Hybrid Generative-Discriminative Deep Models

@machinelearnbot

Deep discriminative classifiers perform remarkably well on problems with a lot of labeled data. So-called deep generative models tend to excel when labeled training data is scarce. Can we do a hybrid, combining the best of both worlds? In this post I outline a hybrid generative-discriminative deep model loosely based on the importance weighted autoencoder (Burda et al., 2015). Don't miss the pretty pictures.
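
As a rough sketch of what such a hybrid objective can look like, the snippet below adds a cross-entropy classification term to a k-sample importance-weighted bound, in the spirit of the IWAE (Burda et al., 2015). The single-layer architecture, latent size, and mixing weight `alpha` are assumptions for illustration, not the model from the post.

```python
# Sketch of a hybrid generative-discriminative objective: a k-sample
# importance-weighted bound (IWAE-style) for the generative part plus a
# cross-entropy term for the discriminative part. Sizes and the mixing
# weight `alpha` are illustrative assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

LOG2PI = math.log(2 * math.pi)

class HybridIWAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16, n_classes=10, k=5, alpha=0.1):
        super().__init__()
        self.k, self.alpha = k, alpha
        self.enc = nn.Linear(x_dim, 2 * z_dim)  # -> (mu, log_var) of q(z|x)
        self.dec = nn.Linear(z_dim, x_dim)      # -> Bernoulli logits of p(x|z)
        self.cls = nn.Linear(z_dim, n_classes)  # classifier head on the latent

    def loss(self, x, y):
        mu, log_var = self.enc(x).chunk(2, dim=-1)
        std = torch.exp(0.5 * log_var)
        z = mu + std * torch.randn(self.k, *mu.shape)   # k importance samples
        # log p(x|z): Bernoulli likelihood of the (binarized) input
        log_px = -F.binary_cross_entropy_with_logits(
            self.dec(z), x.expand(self.k, *x.shape), reduction="none").sum(-1)
        # log p(z) and log q(z|x): diagonal Gaussians
        log_pz = (-0.5 * z.pow(2) - 0.5 * LOG2PI).sum(-1)
        log_qz = (-0.5 * ((z - mu) / std).pow(2)
                  - torch.log(std) - 0.5 * LOG2PI).sum(-1)
        iwae = torch.logsumexp(log_px + log_pz - log_qz, 0) - math.log(self.k)
        # hybrid objective: negative IWAE bound + weighted classification loss
        return -iwae.mean() + self.alpha * F.cross_entropy(self.cls(mu), y)

model = HybridIWAE()
x = torch.rand(32, 784).round()         # toy binarized inputs
y = torch.randint(0, 10, (32,))
model.loss(x, y).backward()
```

The weight `alpha` trades off the two terms: with few labels one can lean on the generative bound, and with plenty of labels the discriminative term dominates, which is the "best of both worlds" the post is after.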