 Inductive Learning


Weakly-supervised Discovery of Visual Pattern Configurations

Neural Information Processing Systems

The prominence of weakly labeled data gives rise to a growing demand for object detection methods that can cope with minimal supervision. We propose an approach that automatically identifies discriminative configurations of visual patterns that are characteristic of a given object class. We formulate the problem as a constrained submodular optimization problem and demonstrate the benefits of the discovered configurations in remedying mislocalizations and finding informative positive and negative training examples. Papers published at the Neural Information Processing Systems Conference.


Enforcing balance allows local supervised learning in spiking recurrent networks

Neural Information Processing Systems

To predict sensory inputs or control motor trajectories, the brain must constantly learn temporal dynamics based on error feedback. However, it remains unclear how such supervised learning is implemented in biological neural networks. Learning in recurrent spiking networks is notoriously difficult because local changes in connectivity may have an unpredictable effect on the global dynamics. The most commonly used learning rules, such as temporal back-propagation, are not local and thus not biologically plausible. Furthermore, reproducing the Poisson-like statistics of neural responses requires the use of networks with balanced excitation and inhibition.


Top Rank Optimization in Linear Time

Neural Information Processing Systems

Bipartite ranking aims to learn a real-valued ranking function that orders positive instances before negative instances. Recent efforts in bipartite ranking focus on optimizing ranking accuracy at the top of the ranked list. Most existing approaches either optimize task-specific metrics or extend the rank loss by placing more emphasis on the errors associated with the top-ranked instances, leading to a high computational cost that is super-linear in the number of training instances. We propose a highly efficient approach, titled TopPush, for optimizing accuracy at the top that has computational complexity linear in the number of training instances. We present a novel analysis that bounds the generalization error for the top-ranked instances for the proposed approach.
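One way to read "accuracy at the top" is to penalize every positive instance that is not scored clearly above the highest-scoring negative. The sketch below illustrates that idea only; it is not the authors' optimization algorithm. The function name `top_push_loss`, the unit margin, and the truncated quadratic surrogate are assumptions made for illustration. Note that evaluating it takes one pass over the negatives and one pass over the positives, which is linear in the number of instances.

```python
def top_push_loss(pos_scores, neg_scores):
    """Average surrogate penalty for positives not ranked above the top negative.

    pos_scores, neg_scores: lists of real-valued scores from a ranking function.
    The unit margin and squared hinge are illustrative assumptions.
    """
    top_negative = max(neg_scores)  # single pass over the negatives
    total = 0.0
    for score in pos_scores:  # single pass over the positives
        violation = max(0.0, 1.0 + top_negative - score)
        total += violation ** 2
    return total / len(pos_scores)


if __name__ == "__main__":
    # The first positive sits below the top negative and is penalized.
    print(top_push_loss([0.0, 3.0], [1.0, -2.0]))
```

Because only the single highest-scoring negative enters the loss, no pairwise comparison between positives and negatives is needed.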


Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning

Neural Information Processing Systems

Effective convolutional neural networks are trained on large sets of labeled data. However, creating large labeled datasets is a very costly and time-consuming task. Semi-supervised learning uses unlabeled data to train a model with higher accuracy when there is a limited set of labeled data available. In this paper, we consider the problem of semi-supervised learning with convolutional neural networks. Techniques such as randomized data augmentation, dropout and random max-pooling provide better generalization and stability for classifiers that are trained using gradient descent.


Learning Mixtures of Submodular Functions for Image Collection Summarization

Neural Information Processing Systems

We address the problem of image collection summarization by learning mixtures of submodular functions. We argue that submodularity is very natural to this problem, and we show that a number of previously used scoring functions are submodular -- a property not explicitly mentioned in these publications. We provide classes of submodular functions capturing the necessary properties of summaries, namely coverage, likelihood, and diversity. To learn mixtures of these submodular functions as scoring functions, we formulate summarization as a supervised learning problem using large-margin structured prediction. Furthermore, we introduce a novel evaluation metric, which we call V-ROUGE, for automatic summary scoring.
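To make the idea of a mixture of submodular scoring functions concrete, here is a minimal sketch. It is not the paper's learned model: the two component functions (`coverage` over hypothetical image tags, `diversity` over hypothetical cluster ids), the fixed weights, and the greedy selection routine are all assumptions chosen for illustration. A nonnegative weighted sum of submodular functions is itself submodular, which is what makes simple greedy selection a reasonable heuristic here.

```python
def coverage(summary, tags):
    """Number of distinct tags covered by the chosen images (monotone submodular)."""
    covered = set()
    for i in summary:
        covered |= tags[i]
    return len(covered)


def diversity(summary, cluster):
    """Number of distinct clusters represented in the summary (monotone submodular)."""
    return len({cluster[i] for i in summary})


def greedy_summary(n_items, budget, score):
    """Pick up to `budget` items, each time adding the one with the largest marginal gain."""
    chosen = []
    for _ in range(budget):
        remaining = [i for i in range(n_items) if i not in chosen]
        if not remaining:
            break
        base = score(chosen)
        best = max(remaining, key=lambda i: score(chosen + [i]) - base)
        chosen.append(best)
    return chosen


# Hypothetical toy collection: four images with tags and cluster assignments.
tags = [{"beach", "sea"}, {"beach"}, {"city", "night"}, {"sea"}]
cluster = [0, 0, 1, 0]
weights = (1.0, 0.5)  # illustrative mixture weights, not learned


def mixture(summary):
    return weights[0] * coverage(summary, tags) + weights[1] * diversity(summary, cluster)


if __name__ == "__main__":
    print(greedy_summary(len(tags), 2, mixture))
```

In the setting the abstract describes, the mixture weights would be learned from human summaries rather than fixed by hand as they are here.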


Toddler-Inspired Visual Object Learning

Neural Information Processing Systems

Real-world learning systems have practical limitations on the quality and quantity of the training datasets that they can collect and consider. How should a system go about choosing a subset of the possible training examples that still allows for learning accurate, generalizable models? To help address this question, we draw inspiration from a highly efficient practical learning system: the human child. Using head-mounted cameras, eye gaze trackers, and a model of foveated vision, we collected first-person (egocentric) images that represent a highly accurate approximation of the "training data" that toddlers' visual systems collect in everyday, naturalistic learning contexts. We used state-of-the-art computer vision learning models (convolutional neural networks) to help characterize the structure of these data, and found that child data produce significantly better object models than egocentric data experienced by adults in exactly the same environment.


Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

Neural Information Processing Systems

The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks. It maintains an exponential moving average of label predictions on each training example, and penalizes predictions that are inconsistent with this target. However, because the targets change only once per epoch, Temporal Ensembling becomes unwieldy when learning large datasets. To overcome this problem, we propose Mean Teacher, a method that averages model weights instead of label predictions. As an additional benefit, Mean Teacher improves test accuracy and enables training with fewer labels than Temporal Ensembling.
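The weight averaging described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: weights are plain lists of floats rather than network parameters, and the names `ema_update` and `consistency_cost`, the `decay` value of 0.99, and the squared-error form of the consistency penalty are all assumptions.

```python
def ema_update(teacher, student, decay=0.99):
    """Move teacher weights toward the student's with an exponential moving average."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def consistency_cost(student_pred, teacher_pred):
    """Mean squared difference between student and teacher predictions."""
    diffs = [(s - t) ** 2 for s, t in zip(student_pred, teacher_pred)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    teacher = [1.0, 0.0]
    student = [0.0, 1.0]
    # Called after every training step, so targets refresh far more often than once per epoch.
    teacher = ema_update(teacher, student, decay=0.5)
    print(teacher)
    print(consistency_cost([0.5, 0.5], [1.0, 0.0]))
```

The contrast with Temporal Ensembling is in what gets averaged: here the moving average runs over the model's weights, so it can be updated at every step instead of once per pass over the data.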


Learning From Weakly Supervised Data by The Expectation Loss SVM (e-SVM) algorithm

Neural Information Processing Systems

In many situations we have some measurement of confidence on "positiveness" for a binary label. The "positiveness" is a continuous value whose range is a bounded interval. We propose a novel learning algorithm called expectation loss SVM (e-SVM) that is devoted to the problems where only the "positiveness" instead of a binary label of each training sample is available. Our e-SVM algorithm can also be readily extended to learn segment classifiers under weak supervision where the exact positiveness value of each training example is unobserved. In experiments, we show that the e-SVM algorithm can effectively address the segment proposal classification task under both strong supervision (e.g. the pixel-level annotations are available) and the weak supervision (e.g.


What is the difference between supervised and unsupervised machine learning?

#artificialintelligence

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. Machine learning, the subset of artificial intelligence that teaches computers to perform tasks through examples and experience, is a hot area of research and development. Many of the applications we use daily rely on machine learning algorithms, including AI assistants, web search and machine translation. Your social media news feed is powered by a machine learning algorithm. The recommended videos you see on YouTube and Netflix are the result of a machine learning model.


Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples

Neural Information Processing Systems

Self-paced learning and hard example mining re-weight training instances to improve learning accuracy. This paper presents two improved alternatives based on lightweight estimates of sample uncertainty in stochastic gradient descent (SGD): the variance in predicted probability of the correct class across iterations of mini-batch SGD, and the proximity of the correct class probability to the decision threshold. Extensive experimental results on six datasets show that our methods reliably improve accuracy in various network architectures, including additional gains on top of other popular training techniques, such as residual learning, momentum, ADAM, batch normalization, dropout, and distillation.
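The two uncertainty estimates named in the Active Bias abstract can be illustrated with a short sketch. It assumes that, for each training sample, we have recorded the predicted probability of its correct class at earlier SGD iterations. The function names, the smoothing constant `eps`, the use of `p * (1 - p)` as a closeness measure, and the normalization to a mean weight of 1 are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev


def _normalize(raw):
    """Rescale raw weights so that they average to 1."""
    avg = mean(raw)
    return [r / avg for r in raw]


def variance_weights(history, eps=0.05):
    """Weight samples by the spread of their correct-class probability over time.

    history: one list per sample, holding the probabilities recorded so far.
    """
    return _normalize([pstdev(probs) + eps for probs in history])


def threshold_weights(history, eps=0.05):
    """Weight samples by how close their average probability is to 0.5."""
    raw = []
    for probs in history:
        p = mean(probs)
        raw.append(p * (1.0 - p) + eps)  # largest when p is near 0.5
    return _normalize(raw)


if __name__ == "__main__":
    history = [
        [0.95, 0.96, 0.97],      # confidently and consistently correct
        [0.1, 0.9, 0.1, 0.9],    # prediction keeps flipping
    ]
    print(variance_weights(history))
    print(threshold_weights(history))
```

In both variants the second sample, whose prediction is unstable and sits near the threshold on average, receives the larger weight in the next mini-batch loss.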