A General Purpose Image Processing Chip: Orientation Detection

Neural Information Processing Systems

The generalization ability of a neural network can sometimes be improved dramatically by regularization. To analyze the improvement one needs more refined results than the asymptotic distribution of the weight vector. Here we study the simple case of one-dimensional linear regression under quadratic regularization, i.e., ridge regression. We study the random design, misspecified case, where we derive expansions for the optimal regularization parameter and the ensuing improvement. It is possible to construct examples where it is best to use no regularization.
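
A minimal sketch of the setting described: one-dimensional ridge regression under random design, sweeping the regularization parameter to see when some regularization beats none. The data-generating model and the grid of lambda values are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 20, 0.5
x = rng.normal(size=n)                    # random design
y = 1.5 * x + sigma * rng.normal(size=n)  # noisy targets

x_test = rng.normal(size=10_000)
y_test = 1.5 * x_test + sigma * rng.normal(size=10_000)

for lam in [0.0, 0.1, 1.0, 10.0]:
    # one-dimensional ridge estimator: w = <x, y> / (<x, x> + lambda)
    w = (x @ y) / (x @ x + lam)
    mse = np.mean((y_test - w * x_test) ** 2)
    print(f"lambda={lam:5.1f}  w={w:+.3f}  test MSE={mse:.4f}")
```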


Regularisation in Sequential Learning Algorithms

Neural Information Processing Systems

In this paper, we discuss regularisation in online/sequential learning algorithms. In environments where data arrives sequentially, techniques such as cross-validation to achieve regularisation or model selection are not possible. Further, bootstrapping to determine a confidence level is not practical. To surmount these problems, a minimum variance estimation approach that makes use of the extended Kalman algorithm for training multi-layer perceptrons is employed. The novel contribution of this paper is to show the theoretical links between extended Kalman filtering, Sutton's variable learning rate algorithms and Mackay's Bayesian estimation framework. In doing so, we propose algorithms to overcome the need for heuristic choices of the initial conditions and noise covariance matrices in the Kalman approach.
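
A hedged sketch of the extended-Kalman-filter view of sequential training: the network weights are the state, the network output is the nonlinear measurement, and each arriving sample triggers one predict/update step. The initial covariance `P` and the noise levels `q` and `r` are exactly the heuristic choices the paper seeks to replace; the tiny network and the numerical Jacobian are illustrative.

```python
import numpy as np

def mlp(w, x, nh=5):
    """Tiny 1-input, nh-hidden, 1-output perceptron; w packs all weights."""
    W1, b1, W2, b2 = w[:nh], w[nh:2*nh], w[2*nh:3*nh], w[3*nh]
    return W2 @ np.tanh(W1 * x + b1) + b2

def jacobian(w, x, eps=1e-6):
    """Numerical d(output)/d(weights); a single row, since the output is scalar."""
    H = np.zeros_like(w)
    for i in range(w.size):
        dw = np.zeros_like(w)
        dw[i] = eps
        H[i] = (mlp(w + dw, x) - mlp(w - dw, x)) / (2 * eps)
    return H

nh = 5
rng = np.random.default_rng(1)
w = 0.1 * rng.normal(size=3 * nh + 1)
P = np.eye(w.size)              # initial weight covariance (heuristic)
q, r = 1e-5, 0.05               # process / measurement noise (heuristic)

for _ in range(500):            # data arrives one sample at a time
    x = rng.uniform(-3, 3)
    y = np.sin(x) + np.sqrt(r) * rng.normal()
    P = P + q * np.eye(w.size)          # predict
    H = jacobian(w, x)
    S = H @ P @ H + r                   # innovation variance (scalar output)
    K = P @ H / S                       # Kalman gain
    w = w + K * (y - mlp(w, x))         # update weights
    P = P - np.outer(K, H @ P)          # update covariance
```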


A Superadditive-Impairment Theory of Optic Aphasia

Neural Information Processing Systems

Farah (1990) has proposed an alternative class of explanations involving partial damage to multiple pathways. We explore this explanation for optic aphasia, a disorder in which severe performance deficits are observed when patients are asked to name visually presented objects but, surprisingly, performance is relatively normal on naming objects from auditory cues and on gesturing the appropriate use of visually presented objects.


Regression with Input-dependent Noise: A Gaussian Process Treatment

Neural Information Processing Systems

The prior can be obtained by placing prior distributions on the weights in a neural network, although we would argue that it is perhaps more natural to place priors directly over functions. One tractable way of doing this is to create a Gaussian process prior. This has the advantage that predictions can be made from the posterior using only matrix multiplication for fixed hyperparameters and a global noise level. In contrast, for neural networks (with fixed hyperparameters and a global noise level) it is necessary to use approximations or Markov chain Monte Carlo (MCMC) methods. Rasmussen (1996) has demonstrated that predictions obtained with Gaussian processes are as good as or better than other state-of-the-art predictors. In much of the work on regression problems in the statistical and neural networks literatures, it is assumed that there is a global noise level, independent of the input vector x. The book by Bishop (1995) and the papers by Bishop (1994), MacKay (1995) and Bishop and Qazaz (1997) have examined the case of input-dependent noise for parametric models such as neural networks.
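
A minimal sketch of the matrix-only prediction step mentioned above, using a squared-exponential kernel, fixed hyperparameters, and a global noise level. The input-dependent-noise treatment the paper develops would replace the constant `sigma2 * I` term with a diagonal of per-input noise variances; the kernel and data here are illustrative.

```python
import numpy as np

def kernel(a, b, ell=1.0, amp=1.0):
    """Squared-exponential covariance between 1-D input vectors a and b."""
    return amp * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=30)
y = np.sin(x) + 0.1 * rng.normal(size=30)
x_star = np.linspace(-3, 3, 100)           # test inputs

sigma2 = 0.01                              # global noise level
K = kernel(x, x) + sigma2 * np.eye(x.size)
K_star = kernel(x_star, x)

mean = K_star @ np.linalg.solve(K, y)      # posterior mean
cov = kernel(x_star, x_star) - K_star @ np.linalg.solve(K, K_star.T)
std = np.sqrt(np.diag(cov) + sigma2)       # predictive standard deviation
```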


Two Approaches to Optimal Annealing

Neural Information Processing Systems

The latter studies are based on examining the Kramers-Moyal expansion of the master equation for the weight space probability densities. A different approach, based on the deterministic dynamics of macroscopic quantities called order parameters, has recently been presented [6, 7].
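
To illustrate the setting being analyzed, here is a hedged sketch of online gradient descent on a simple quadratic loss, comparing a fixed learning rate with a 1/t annealing schedule of the kind whose optimality such analyses address. The loss, target, and schedules are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
w_star, sigma = 2.0, 1.0

for schedule in ("fixed", "annealed"):
    w = 0.0
    for t in range(1, 10_001):
        x = rng.normal()
        y = w_star * x + sigma * rng.normal()
        grad = (w * x - y) * x          # gradient of 0.5 * (w*x - y)**2
        eta = 0.05 if schedule == "fixed" else 1.0 / t
        w -= eta * grad
    print(f"{schedule:9s}: final error {abs(w - w_star):.4f}")
```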


Relative Loss Bounds for Multidimensional Regression Problems

Neural Information Processing Systems

We study online generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. We allow transfer functions at the final layer, such as the softmax function, that need to consider the linear activations of all the output neurons. We use distance functions of a certain kind in two completely independent roles in deriving and analyzing online learning algorithms for such tasks. We use one distance function to define a matching loss function for the (possibly multidimensional) transfer function, which allows us to generalize earlier results from one-dimensional to multidimensional outputs. We use another distance function as a tool for measuring the progress made by the online updates. This shows how previously studied algorithms such as gradient descent and exponentiated gradient fit into a common framework. We evaluate the performance of the algorithms using relative loss bounds that compare the loss of the online algorithm to that of the best off-line predictor from the relevant model class, thus completely eliminating probabilistic assumptions about the data.
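
A sketch of the two update families such bounds compare, applied to a softmax output layer whose matching loss is the cross entropy: gradient descent updates the weights additively, while exponentiated gradient updates them multiplicatively and renormalizes. Dimensions and step size are illustrative.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

d, k, eta = 4, 3, 0.1                        # inputs, outputs, step size
rng = np.random.default_rng(0)
x = rng.normal(size=d)
y = np.eye(k)[0]                             # one-hot target

# Gradient descent: additive update of a k x d weight matrix.
W_gd = np.zeros((k, d))
grad = np.outer(softmax(W_gd @ x) - y, x)    # matching-loss gradient
W_gd = W_gd - eta * grad

# Exponentiated gradient: multiplicative update on nonnegative,
# row-normalized weights.
W_eg = np.full((k, d), 1.0 / d)
grad = np.outer(softmax(W_eg @ x) - y, x)
W_eg = W_eg * np.exp(-eta * grad)
W_eg = W_eg / W_eg.sum(axis=1, keepdims=True)
```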


A Neural Network Based Head Tracking System

Neural Information Processing Systems

We have constructed an inexpensive video-based motorized tracking system that learns to track a head. It uses real-time graphical user inputs or an auxiliary infrared detector as supervisory signals to train a convolutional neural network. The inputs to the neural network consist of normalized luminance and chrominance images and motion information from frame differences. Subsampled images are also used to provide scale invariance. During the online training phases the neural network rapidly adjusts the input weights depending on the reliability of the different channels in the surrounding environment. This quick adaptation allows the system to robustly track a head even when other objects are moving within a cluttered background.
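
A hedged sketch of the input preparation described above: normalized luminance and chrominance channels plus a frame-difference motion channel, subsampled to a small grid before being fed to the network. The RGB-to-luminance weights are the standard ITU-R 601 ones; the exact channel definitions and grid size are assumptions, and the network itself is omitted.

```python
import numpy as np

def prepare_inputs(frame, prev_frame, size=32):
    """frame, prev_frame: H x W x 3 float RGB arrays in [0, 1]."""
    r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
    lum = 0.299 * r + 0.587 * g + 0.114 * b        # luminance
    cr, cb = r - lum, b - lum                      # chrominance
    motion = np.abs(frame - prev_frame).sum(-1)    # frame difference

    def subsample(img):
        """Crude block averaging down to a size x size grid."""
        h, w = img.shape
        img = img[: h - h % size, : w - w % size]
        return img.reshape(size, h // size, size, w // size).mean(axis=(1, 3))

    channels = [subsample(c) for c in (lum, cr, cb, motion)]
    # normalize each channel to zero mean, unit variance
    return np.stack([(c - c.mean()) / (c.std() + 1e-8) for c in channels])
```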


A Neural Network Model of Naive Preference and Filial Imprinting in the Domestic Chick

Neural Information Processing Systems

Filial imprinting in domestic chicks is of interest in psychology, biology, and computational modeling because it exemplifies simple, rapid, innately programmed learning which is biased toward learning about some objects. Horn et al. have recently discovered a naive visual preference for heads and necks which develops over the course of the first three days of life. The neurological basis of this predisposition is almost entirely unknown; that of imprinting-related learning is fairly clear. This project is the first model of the predisposition consistent with what is known about learning in imprinting. The model develops the predisposition appropriately, learns to "approach" a training object, and replicates one interaction between the two processes. Future work will replicate more interactions between imprinting and the predisposition in chicks, and analyze why the system works.


Intrusion Detection with Neural Networks

Neural Information Processing Systems

Intrusion detection schemes can be classified into two categories: misuse and anomaly intrusion detection. Misuse refers to known attacks that exploit known vulnerabilities of the system. Anomaly refers to unusual activity in general that could indicate an intrusion.