Goto

Collaborating Authors

 Industry


VLSI Implementation of Cortical Visual Motion Detection Using an Analog Neural Computer

Neural Information Processing Systems

Two dimensional image motion detection neural networks have been implemented using a general purpose analog neural computer. The neural circuits perform spatiotemporal feature extraction based on the cortical motion detection model of Adelson and Bergen. The neural computer provides the neurons, synapses and synaptic time-constants required to realize the model in VLSI hardware. Results show that visual motion estimation can be implemented with simple sum-andthreshold neural hardware with temporal computational capabilities. The neural circuits compute general 20 visual motion in real-time.


Probabilistic Interpretation of Population Codes

Neural Information Processing Systems

We present a theoretical framework for population codes which generalizes naturally to the important case where the population provides information about a whole probability distribution over an underlying quantity rather than just a single value. We use the framework to analyze two existing models, and to suggest and evaluate a third model for encoding such probability distributions.


Early Brain Damage

Neural Information Processing Systems

Optimal Brain Damage (OBD) is a method for reducing the number of weights in a neural network. OBD estimates the increase in cost function if weights are pruned and is a valid approximation if the learning algorithm has converged into a local minimum. On the other hand it is often desirable to terminate the learning process before a local minimum is reached (early stopping). In this paper we show that OBD estimates the increase in cost function incorrectly if the network is not in a local minimum. We also show how OBD can be extended such that it can be used in connection with early stopping. We call this new approach Early Brain Damage, EBD. EBD also allows to revive already pruned weights. We demonstrate the improvements achieved by EBD using three publicly available data sets.


Monotonicity Hints

Neural Information Processing Systems

A hint is any piece of side information about the target function to be learned. We consider the monotonicity hint, which states that the function to be learned is monotonic in some or all of the input variables. The application of mono tonicity hints is demonstrated on two real-world problems-a credit card application task, and a problem in medical diagnosis. A measure of the monotonicity error of a candidate function is defined and an objective function for the enforcement of monotonicity is derived from Bayesian principles. We report experimental results which show that using monotonicity hints leads to a statistically significant improvement in performance on both problems.


A Convergence Proof for the Softassign Quadratic Assignment Algorithm

Neural Information Processing Systems

The softassign quadratic assignment algorithm has recently emerged as an effective strategy for a variety of optimization problems in pattern recognition and combinatorial optimization. While the effectiveness of the algorithm was demonstrated in thousands of simulations, there was no known proof of convergence. Here, we provide a proof of convergence for the most general form of the algorithm.


Adaptive On-line Learning in Changing Environments

Neural Information Processing Systems

An adaptive online algorithm extending the learning of learning idea is proposed and theoretically motivated. Relying only on gradient flow information it can be applied to learning continuous functions or distributions, even when no explicit loss function is given and the Hessian is not available. Its efficiency is demonstrated for a non-stationary blind separation task of acoustic signals.


Competition Among Networks Improves Committee Performance

Neural Information Processing Systems

Since a neural network predictor inherently has an excessive number of parameters, reducing the prediction error is usually done by reducing variance. Methods for reducing neural network complexity can be viewed as a regularization technique to reduce this variance. Examples of such methods are Optimal Brain Damage (Le Cun et.


A Mixture of Experts Classifier with Learning Based on Both Labelled and Unlabelled Data

Neural Information Processing Systems

We address statistical classifier design given a mixed training set consisting of a small labelled feature set and a (generally larger) set of unlabelled features. This situation arises, e.g., for medical images, where although training features may be plentiful, expensive expertise is required to extract their class labels. We propose a classifier structure and learning algorithm that make effective use of unlabelled data to improve performance. The learning is based on maximization of the total data likelihood, i.e. over both the labelled and unlabelled data subsets. Two distinct EM learning algorithms are proposed, differing in the EM formalism applied for unlabelled data. The classifier, based on a joint probability model for features and labels, is a "mixture of experts" structure that is equivalent to the radial basis function (RBF) classifier, but unlike RBFs, is amenable to likelihood-based training. The scope of application for the new method is greatly extended by the observation that test data, or any new data to classify, is in fact additional, unlabelled data - thus, a combined learning/classification operation - much akin to what is done in image segmentation - can be invoked whenever there is new data to classify. Experiments with data sets from the UC Irvine database demonstrate that the new learning algorithms and structure achieve substantial performance gains over alternative approaches.


Combinations of Weak Classifiers

Neural Information Processing Systems

To obtain classification systems with both good generalization performance and efficiency in space and time, we propose a learning method based on combinations of weak classifiers, where weak classifiers are linear classifiers (perceptrons) which can do a little better than making random guesses. A randomized algorithm is proposed to find the weak classifiers. They· are then combined through a majority vote. As demonstrated through systematic experiments, the method developed is able to obtain combinations of weak classifiers with good generalization performance and a fast training time on a variety of test problems and real applications.


Balancing Between Bagging and Bumping

Neural Information Processing Systems

We compare different methods to combine predictions from neural networks trained on different bootstrap samples of a regression problem. One of these methods, introduced in [6] and which we here call balancing, is based on the analysis of the ensemble generalization error into an ambiguity term and a term incorporating generalization performances of individual networks. We show how to estimate these individual errors from the residuals on validation patterns. Weighting factors for the different networks follow from a quadratic programming problem. On a real-world problem concerning the prediction of sales figures and on the well-known Boston housing data set, balancing clearly outperforms other recently proposed alternatives as bagging [1] and bumping [8]. 1 EARLY STOPPING AND BOOTSTRAPPING Stopped training is a popular strategy to prevent overfitting in neural networks.