AITopics

Country: North America > Canada (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.56)

arXiv.org Machine LearningNov-16-2018

Weakly Supervised Semantic Image Segmentation with Self-correcting Networks

Ibrahim, Mostafa S., Vahdat, Arash, Macready, William G.

Building a large image dataset with high-quality object masks for semantic segmentation is costly and time consuming. In this paper, we reduce the data preparation cost by leveraging weak supervision in the form of object bounding boxes. To accomplish this, we propose a principled framework that trains a deep convolutional segmentation model that combines a large set of weakly supervised images (having only object bounding box labels) with a small set of fully supervised images (having semantic segmentation labels and box labels). Our framework trains the primary segmentation model with the aid of an ancillary model that generates initial segmentation labels for the weakly supervised instances and a self-correction module that improves the generated labels during training using the increasingly accurate primary model. We introduce two variants of the self-correction module using either linear or convolutional functions. Experiments on the PASCAL VOC 2012 and Cityscape datasets show that our models trained with a small fully supervised set perform similar to, or better than, models trained with a large fully supervised set while requiring ~7x less annotation effort.

computer vision and pattern recognition, deep learning, neural network, (18 more...)

1811.07073

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningSep-28-2018

Improved Gradient-Based Optimization Over Discrete Distributions

Andriyash, Evgeny, Vahdat, Arash, Macready, Bill

In many applications we seek to maximize an expectation with respect to a distribution over discrete variables. Estimating gradients of such objectives with respect to the distribution parameters is a challenging problem. We analyze existing solutions including finite-difference (FD) estimators and continuous relaxation (CR) estimators in terms of bias and variance. We show that the commonly used Gumbel-Softmax estimator is biased and propose a simple method to reduce it. We also derive a simpler piece-wise linear continuous relaxation that also possesses reduced bias. We demonstrate empirically that reduced bias leads to a better performance in variational inference and on binary optimization tasks. Discrete stochastic variables arise naturally for certain types of data, and distributions over discrete variables can be important components of probabilistic models. Eq. (1) is commonly minimized by gradient-based methods, which require estimating the gradient The two main approaches to this problem are score function estimators and pathwise derivative estimators(see Schulman et al. (2015) for an overview). In this paper we focus on pathwise derivative estimators. This approach is applicable to cases in which the stochastic variables can be reparameterized as a function of other parameter-independent random variables, i.e. z g However, for discrete variables the cumulative distribution function (CDF) is discontinuous and reparameterization is not possible.

estimator, neural network, survey article, (17 more...)

1810.00116

Country: North America > Canada (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)

arXiv.org Machine LearningMay-18-2018

DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors

Vahdat, Arash, Andriyash, Evgeny, Macready, William G.

Boltzmann machines are powerful distributions that have been shown to be an effective prior over binary latent variables in variational autoencoders (VAEs). However, previous methods for training discrete VAEs have used the evidence lower bound and not the tighter importance-weighted bound. We propose two approaches for relaxing Boltzmann machines to continuous distributions that permit training with importance-weighted bounds. These relaxations are based on generalized overlapping transformations and the Gaussian integral trick. Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors.

deep learning, neural network, transformation, (16 more...)

1805.07445

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.56)

arXiv.org Machine LearningFeb-13-2018

DVAE++: Discrete Variational Autoencoders with Overlapping Transformations

Vahdat, Arash, Macready, William G., Bian, Zhengbing, Khoshaman, Amir

Training of discrete latent variable models remains challenging because passing gradient information through discrete units is difficult. We propose a new class of smoothing transformations based on a mixture of two overlapping distributions, and show that the proposed transformation can be used for training binary latent models with either directed or undirected priors. We derive a new variational bound to efficiently train with Boltzmann machine priors. Using this bound, we develop DVAE++, a generative model with a global discrete prior and a hierarchy of convolutional continuous variables. Experiments on several benchmarks show that overlapping transformations outperform other recent continuous relaxations of discrete latent variables including Gumbel-Softmax (Maddison et al., 2016; Jang et al., 2016), and discrete variational autoencoders (Rolfe 2016).

deep learning, latent variable, neural network, (16 more...)

1802.0492

Country: North America > Canada (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Neural Information Processing SystemsDec-31-2017

Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks

Vahdat, Arash

Collecting large training datasets, annotated with high-quality labels, is costly and time-consuming. This paper proposes a novel framework for training deep convolutional neural networks from noisy labeled datasets that can be obtained cheaply. The problem is formulated using an undirected graphical model that represents the relationship between noisy and clean labels, trained in a semi-supervised setting. In our formulation, the inference over latent clean labels is tractable and is regularized during training using auxiliary sources of information. The proposed model is applied to the image labeling problem and is shown to be effective in labeling unseen images as well as reducing label noise in training on CIFAR-10 and MS COCO datasets.

clean label, deep learning, neural network, (15 more...)

Country:

North America > United States (0.14)
North America > Canada (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Machine LearningNov-2-2017

Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks

Vahdat, Arash

Collecting large training datasets, annotated with high-quality labels, is costly and time-consuming. This paper proposes a novel framework for training deep convolutional neural networks from noisy labeled datasets that can be obtained cheaply. The problem is formulated using an undirected graphical model that represents the relationship between noisy and clean labels, trained in a semi-supervised setting. In our formulation, the inference over latent clean labels is tractable and is regularized during training using auxiliary sources of information. The proposed model is applied to the image labeling problem and is shown to be effective in labeling unseen images as well as reducing label noise in training on CIFAR-10 and MS COCO datasets.

clean label, deep learning, neural network, (15 more...)

1706.00038

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing SystemsDec-31-2013

Latent Maximum Margin Clustering

Zhou, Guang-Tong, Lan, Tian, Vahdat, Arash, Mori, Greg

We present a maximum margin framework that clusters data using latent variables. Using latent representations enables our framework to model unobserved information embedded in the data. We implement our idea by large margin learning, and develop an alternating descent algorithm to effectively solve the resultant non-convex optimization problem. We instantiate our latent maximum margin clustering framework with tag-based video clustering tasks, where each video is represented by a latent tag model describing the presence or absence of video tags. Experimental results obtained on three standard datasets show that the proposed method outperforms non-latent maximum margin clustering as well as conventional clustering approaches.

artificial intelligence, latent variable, machine learning, (17 more...)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)

Neural Information Processing SystemsDec-31-2012

Kernel Latent SVM for Visual Recognition

Yang, Weilong, Wang, Yang, Vahdat, Arash, Mori, Greg

Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) -- a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning.

artificial intelligence, inductive learning, latent variable, (16 more...)

Country: North America > Canada > Ontario > Toronto (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.49)