 Perceptrons


On Herding and the Perceptron Cycling Theorem

Neural Information Processing Systems

The paper develops a connection between traditional perceptron algorithms and recently introduced herding algorithms. It is shown that both algorithms can be viewed as an application of the perceptron cycling theorem. This connection strengthens some herding results and suggests new (supervised) herding algorithms that, like CRFs or discriminative RBMs, make predictions by conditioning on the input attributes. We develop and investigate variants of conditional herding, and show that conditional herding leads to practical algorithms that perform better than or on par with related classifiers such as the voted perceptron and the discriminative RBM.
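The perceptron-like character of herding can be illustrated with a minimal sketch. The abstract does not spell out the update rule; the code below assumes the commonly described herding step (pick the state that maximises the inner product with the weights, then move the weights toward the target moments and away from the chosen state's features). The function name `herd`, the one-hot feature map, and the target moments are hypothetical choices for illustration only.

```python
import numpy as np

def herd(features, target_moments, num_steps):
    """Run a herding-style loop over a finite state set (illustrative sketch).

    features: array of shape (num_states, dim); row s is the feature vector phi(s).
    target_moments: array of shape (dim,); the moments the samples should match.
    Returns the list of selected state indices.
    """
    w = np.zeros_like(target_moments, dtype=float)
    samples = []
    for _ in range(num_steps):
        # Select the state whose features align best with the current weights.
        s = int(np.argmax(features @ w))
        # Perceptron-like correction: toward the target, away from the pick.
        w += target_moments - features[s]
        samples.append(s)
    return samples

# Assumed toy setup: three states with one-hot features.
features = np.eye(3)
target = np.array([0.5, 0.3, 0.2])
samples = herd(features, target, 1000)
empirical = features[samples].mean(axis=0)
max_error = np.abs(empirical - target).max()
print(max_error)
```

Because the weights stay bounded in this toy setup, the gap between the empirical feature averages and the target moments shrinks roughly in proportion to one over the number of steps, which is the kind of behaviour the cycling-theorem view is meant to explain.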


Perceptron Learning of SAT

Neural Information Processing Systems

Boolean satisfiability (SAT), as a canonical NP-complete decision problem, is one of the most important problems in computer science. In practice, real-world SAT sentences are drawn from a distribution that may admit efficient algorithms for their solution. Such SAT instances are likely to have shared characteristics and substructures. This work approaches the exploration of a family of SAT solvers as a learning problem. In particular, we relate polynomial time solvability of a SAT subset to a notion of margin between sentences mapped by a feature function into a Hilbert space.


Local Supervised Learning through Space Partitioning

Venkatesh Saligrama, Dept. of Electrical and Computer Engineering, Boston University

Neural Information Processing Systems

We develop a novel approach for supervised learning based on adaptively partitioning the feature space into different regions and learning local region-specific classifiers. We formulate an empirical risk minimization problem that incorporates both partitioning and classification into a single global objective. We show that space partitioning can be equivalently reformulated as a supervised learning problem and consequently any discriminative learning method can be utilized in conjunction with our approach. Nevertheless, we consider locally linear schemes by learning linear partitions and linear region classifiers. Locally linear schemes can not only approximate complex decision boundaries and ensure low training error but also provide tight control on over-fitting and generalization error. We train locally linear classifiers by using LDA, logistic regression and perceptrons, and so our scheme is scalable to large data sizes and high dimensions. We present experimental results demonstrating improved performance over state-of-the-art classification techniques on benchmark datasets. We also show improved robustness to label noise.
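To make the idea of a locally linear scheme concrete, here is a small prediction-only sketch: one linear partition splits the space into two regions, and each region has its own linear classifier. The weights are set by hand rather than learned, and the function name `predict_locally_linear` and the toy data are assumptions for illustration; this is not the paper's training procedure.

```python
import numpy as np

def predict_locally_linear(X, g, f_left, f_right):
    """Predict labels in {-1, +1} using one linear partition and two
    linear region classifiers (all weight vectors include a bias term)."""
    # Append a constant 1 so the last weight acts as a bias.
    X1 = np.hstack([X, np.ones((len(X), 1))])
    # The partition g decides which region each point falls in.
    in_right = X1 @ g > 0
    # Score each point with the classifier of its own region.
    scores = np.where(in_right, X1 @ f_right, X1 @ f_left)
    return np.where(scores > 0, 1, -1)

# Assumed toy problem: label = sign(x0 * x1), which no single
# linear classifier can represent.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
g = np.array([1.0, 0.0, 0.0])         # split on the sign of x0
f_right = np.array([0.0, 1.0, 0.0])   # region x0 > 0: label follows x1
f_left = np.array([0.0, -1.0, 0.0])   # region x0 <= 0: label follows -x1
labels = predict_locally_linear(X, g, f_left, f_right)
print(labels)
```

Two linear pieces are enough here to represent an XOR-like boundary, which is the sense in which locally linear schemes can approximate decision boundaries that a single linear classifier cannot.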


d9fc5b73a8d78fad3d6dffe419384e70-Reviews.html

Neural Information Processing Systems

Overview: This paper proposes an algorithm for learning general structured predictors (e.g., non-linear). This is done by replacing the structured hinge loss with its smooth dual LP relaxation and observing that optimizing over classifiers reduces to a logistic regression task. Therefore, the learning problem can be extended to cases where this optimization over the class of predictors can be solved efficiently. Specifically, the paper shows how this enables learning predictors like decision trees and multi-layer perceptrons in addition to the common linear classifiers.
Pros:
* The observation made by the authors about reduction of the learning objective to a logistic regression problem seems novel and interesting.


d81f9c1be2e08964bf9f24b15f0e4900-Reviews.html

Neural Information Processing Systems

This paper proposes a neural network architecture that falls somewhere between multilayer perceptrons (MLPs) and sigmoid belief networks (SBNs). The motivation is to permit multimodal predictive distributions (like SBNs) by using stochastic hidden units, while adding deterministic hidden units to smooth the predictive distribution in the case of real-valued data. The paper's main technical contribution is an EM-style algorithm where the E-step uses importance sampling to approximate the posterior and the M-step uses backpropagation to update the parameters. The experiments demonstrate the model's utility on several synthetic and real datasets. Quality: I liked this paper; the use of stochastic and deterministic units seems reasonably justified.


Learning Stochastic Feedforward Neural Networks

Neural Information Processing Systems

Multilayer perceptrons (MLPs) or neural networks are popular models used for nonlinear regression and classification tasks. As regressors, MLPs model the conditional distribution of the predictor variables Y given the input variables X. However, this predictive distribution is assumed to be unimodal (e.g.


Perfect Associative Learning with Spike-Timing-Dependent Plasticity

Neural Information Processing Systems

Recent extensions of the Perceptron such as the Tempotron and the Chronotron suggest that this theoretical concept is highly relevant for understanding networks of spiking neurons in the brain. It is not known, however, how the computational power of the Perceptron might be accomplished by the plasticity mechanisms of real synapses. Here we prove that spike-timing-dependent plasticity having an anti-Hebbian form for excitatory synapses as well as a spike-timing-dependent plasticity of Hebbian shape for inhibitory synapses are sufficient for realizing the original Perceptron Learning Rule if these respective plasticity mechanisms act in concert with the hyperpolarisation of the post-synaptic neurons. We also show that with these simple yet biologically realistic dynamics Tempotrons and Chronotrons are learned. The proposed mechanism enables incremental associative learning from a continuous stream of patterns and might therefore underlie the acquisition of long term memories in cortex. Our results underline that learning processes in realistic networks of spiking neurons depend crucially on the interactions of synaptic plasticity mechanisms with the dynamics of participating neurons.


Predtron: A Family of Online Algorithms for General Prediction Problems

Neural Information Processing Systems

Modern prediction problems arising in multilabel learning and learning to rank pose unique challenges to the classical theory of supervised learning. These problems have large prediction and label spaces of a combinatorial nature and involve sophisticated loss functions. We offer a general framework to derive mistake driven online algorithms and associated loss bounds. The key ingredients in our framework are a general loss function, a general vector space representation of predictions, and a notion of margin with respect to a general norm. Our general algorithm, Predtron, yields the perceptron algorithm and its variants when instantiated on classic problems such as binary classification, multiclass classification, ordinal regression, and multilabel classification. For multilabel ranking and subset ranking, we derive novel algorithms, notions of margins, and loss bounds. A simulation study confirms the behavior predicted by our bounds and demonstrates the flexibility of the design choices in our framework.
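For reference, the binary-classification special case mentioned above is the classic mistake-driven perceptron. The sketch below shows that classic algorithm only, not Predtron itself; the function name `perceptron`, the epoch count, and the toy data are assumptions for illustration.

```python
import numpy as np

def perceptron(X, y, epochs=10):
    """Classic perceptron for labels in {-1, +1}.

    Returns the weight vector and the number of updates (mistakes) made.
    """
    w = np.zeros(X.shape[1])
    mistakes = 0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            # Update only when the current weights misclassify the example
            # (a zero margin counts as a mistake).
            if y_i * (w @ x_i) <= 0:
                w += y_i * x_i
                mistakes += 1
    return w, mistakes

# Assumed toy data: linearly separable through the origin.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, mistakes = perceptron(X, y)
print(w, mistakes)
```

The weights change only on mistakes, so on separable data the number of updates is finite; loss bounds of the kind the abstract refers to generalise this style of mistake counting to richer prediction spaces.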


The Return of the Gating Network: Combining Generative Models and Discriminative Training in Natural Image Priors

Neural Information Processing Systems

In recent years, approaches based on machine learning have achieved state-of-the-art performance on image restoration problems. Successful approaches include both generative models of natural images as well as discriminative training of deep neural networks. Discriminative training of feed-forward architectures allows explicit control over the computational cost of performing restoration and therefore often leads to better performance at the same cost at run time. In contrast, generative models have the advantage that they can be trained once and then adapted to any image restoration task by a simple use of Bayes' rule. In this paper we show how to combine the strengths of both approaches by training a discriminative, feed-forward architecture to predict the state of latent variables in a generative model of natural images. We apply this idea to the very successful Gaussian Mixture Model (GMM) of natural images. We show that it is possible to achieve performance comparable to that of the original GMM but with two orders of magnitude improvement in run time while maintaining the advantage of generative models.


Quantum Perceptron Models

Neural Information Processing Systems

We demonstrate how quantum computation can provide non-trivial improvements in the computational and statistical complexity of the perceptron model. We develop two quantum algorithms for perceptron learning. The first algorithm exploits quantum information processing to determine a separating hyperplane using a number of steps sublinear in the number of data points N, namely O(√N).