Non-Intrusive Gaze Tracking Using Artificial Neural Networks

Neural Information Processing Systems

We have developed an artificial neural network-based gaze tracking system which can be customized to individual users. Unlike other gaze trackers, which normally require the user to wear cumbersome headgear or to use a chin rest to ensure head immobility, our system is entirely non-intrusive. Currently, the best intrusive gaze tracking systems are accurate to approximately 0.75 degrees. In our experiments, we have been able to achieve an accuracy of 1.5 degrees while allowing head mobility. In this paper we present an empirical analysis of the performance of a large number of artificial neural network architectures for this task.
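As an illustration of the basic regression setup only (not the authors' architecture), the sketch below trains a small one-hidden-layer network in Python/NumPy to map a flattened eye image to (x, y) gaze coordinates; the image size, hidden width, learning rate, and training data are assumptions made for the example.

```python
# Minimal sketch of an ANN gaze mapper: a one-hidden-layer network mapping a
# flattened low-resolution eye image to (x, y) gaze coordinates. Sizes and
# training details are illustrative assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 15 * 40, 20, 2   # assumed image size and hidden width
W1 = rng.normal(0, 0.1, (n_in, n_hidden))
W2 = rng.normal(0, 0.1, (n_hidden, n_out))

def forward(x):
    h = np.tanh(x @ W1)          # hidden activations
    return h, h @ W2             # predicted gaze (x, y)

def train_step(x, y, lr=1e-3):
    """One step of plain gradient descent on squared error."""
    global W1, W2
    h, y_hat = forward(x)
    err = y_hat - y
    gW2 = np.outer(h, err)
    gW1 = np.outer(x, (err @ W2.T) * (1 - h ** 2))  # backprop through tanh
    W2 -= lr * gW2
    W1 -= lr * gW1
    return float((err ** 2).mean())

# Synthetic example: a random "eye image" and a gaze target.
x = rng.random(n_in)
y = np.array([0.3, -0.2])
for _ in range(100):
    loss = train_step(x, y)
print(f"final squared error: {loss:.4f}")
```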


Monte Carlo Matrix Inversion and Reinforcement Learning

Neural Information Processing Systems

We describe the relationship between certain reinforcement learning (RL) methods based on dynamic programming (DP) and a class of unorthodox Monte Carlo methods for solving systems of linear equations proposed in the 1950s. These methods recast the solution of the linear system as the expected value of a statistic suitably defined over sample paths of a Markov chain. The significance of our observations lies in arguments (Curtiss, 1954) that these Monte Carlo methods scale better with respect to state-space size than do standard, iterative techniques for solving systems of linear equations. This analysis also establishes convergence rate estimates. Because methods used in RL systems for approximating the evaluation function of a fixed control policy also approximate solutions to systems of linear equations, the connection to these Monte Carlo methods establishes that algorithms very similar to TD algorithms (Sutton, 1988) are asymptotically more efficient in a precise sense than other methods for evaluating policies. Further, all DP-based RL methods have some of the properties of these Monte Carlo algorithms, which suggests that although RL is often perceived to be slow, for sufficiently large problems it may in fact be more efficient than other known classes of methods capable of producing the same results.
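The core identity can be demonstrated in a few lines: the solution of the linear system v = r + gamma * P * v, i.e. (I - gamma * P) v = r, equals the expected discounted sum of rewards along sample paths of the chain P. The Python sketch below (toy chain, made-up numbers) compares the Monte Carlo estimate with a direct solve.

```python
# The Monte Carlo linear-system idea behind policy evaluation: the solution
# of v = r + gamma * P v is the expected discounted return along sample
# paths of the chain P. All numbers here are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, gamma = 5, 0.9
P = rng.random((n, n)); P /= P.sum(axis=1, keepdims=True)  # transition matrix
r = rng.random(n)                                          # per-state reward

# Exact solution of the linear system (I - gamma P) v = r.
v_exact = np.linalg.solve(np.eye(n) - gamma * P, r)

def mc_value(start, n_paths=1000, horizon=100):
    """Average discounted return over truncated sample paths from `start`."""
    total = 0.0
    for _ in range(n_paths):
        s, g, ret = start, 1.0, 0.0
        for _ in range(horizon):     # gamma**100 is negligible
            ret += g * r[s]
            g *= gamma
            s = rng.choice(n, p=P[s])
        total += ret
    return total / n_paths

v_mc = np.array([mc_value(s) for s in range(n)])
print(np.max(np.abs(v_mc - v_exact)))  # small for enough sample paths
```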


Efficient Simulation of Biological Neural Networks on Massively Parallel Supercomputers with Hypercube Architecture

Neural Information Processing Systems

We present a neural network simulation which we implemented on the massively parallel Connection Machine 2. In contrast to previous work, this simulator is based on biologically realistic neurons with nontrivial single-cell dynamics, high connectivity with a structure modelled in agreement with biological data, and preservation of the temporal dynamics of spike interactions. We simulate neural networks of 16,384 neurons coupled by about 1000 synapses per neuron, and estimate the performance for much larger systems. Communication between neurons is identified as the computationally most demanding task, and we present a novel method to overcome this bottleneck. The simulator has already been used to study the primary visual system of the cat.

1 INTRODUCTION

Neural networks have been implemented previously on massively parallel supercomputers (Fujimoto et al., 1992; Zhang et al., 1990). However, these are implementations of artificial, highly simplified neural networks, while our aim was explicitly to provide a simulator for biologically realistic neural networks.
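One generic way to reduce inter-processor traffic in a spiking-network simulation (this sketch is not the paper's CM-2 method) is to exchange only the indices of neurons that spiked in a timestep rather than full state arrays, since spikes are sparse. A toy Python illustration with assumed sizes:

```python
# Illustrative only: communicate short lists of spike indices per timestep
# instead of every neuron's state. Sizes are toy values; the paper simulates
# 16,384 neurons with ~1000 synapses each.
import numpy as np

rng = np.random.default_rng(2)
n_proc, n_per_proc = 4, 16            # assumed processors and neurons each
v = rng.random((n_proc, n_per_proc))  # membrane "potentials", one row per processor
threshold = 0.95

# Each processor finds its local spikes (global neuron indices)...
local_spikes = [np.flatnonzero(v[p] > threshold) + p * n_per_proc
                for p in range(n_proc)]

# ...and only these short index lists are exchanged (an all-gather in a real
# message-passing implementation), not the full state arrays.
global_spikes = np.concatenate(local_spikes)
print("spiking neurons this timestep:", global_spikes)
```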


Unsupervised Parallel Feature Extraction from First Principles

Neural Information Processing Systems

We describe a number of learning rules that can be used to train unsupervised parallel feature extraction systems. The learning rules are derived using gradient ascent of a quality function. We consider a number of quality functions that are rational functions of higher-order moments of the extracted feature values. We show that one system learns the principal components of the correlation matrix. Principal component analysis systems are usually not optimal feature extractors for classification.
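A closely related classical example (not necessarily one of the paper's rules) is Oja's rule, which performs stochastic gradient ascent on the variance of a linear feature under a norm constraint and converges to the first principal component of the input correlation matrix:

```python
# Oja's rule: Hebbian update with a weight-decay term that keeps the weight
# vector normalized; it converges (up to sign) to the leading principal
# component. Data and step size are illustrative.
import numpy as np

rng = np.random.default_rng(3)
# Correlated 2-D data whose leading principal component the rule should recover.
X = rng.normal(size=(5000, 2)) @ np.array([[2.0, 1.5], [0.0, 0.5]])

w = rng.normal(size=2)
eta = 1e-3
for x in X:
    y = w @ x
    w += eta * y * (x - y * w)   # Hebbian term minus implicit normalization

# Compare with the leading eigenvector of the correlation matrix (sign may flip).
eigvals, eigvecs = np.linalg.eigh(X.T @ X / len(X))
print(w / np.linalg.norm(w), eigvecs[:, -1])
```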


Central and Pairwise Data Clustering by Competitive Neural Networks

Neural Information Processing Systems

Data clustering amounts to a combinatorial optimization problem to reduce the complexity of a data representation and to increase its precision. Central and pairwise data clustering are studied in the maximum entropy framework. For central clustering we derive a set of reestimation equations and a minimization procedure which yields an optimal number of clusters, their centers and their cluster probabilities. A mean-field approximation for pairwise clustering is used to estimate assignment probabilities. A self-consistent solution to multidimensional scaling and pairwise clustering is derived which yields an optimal embedding and clustering of data points in a d-dimensional Euclidean space.

1 Introduction

A central problem in information processing is the reduction of data complexity with minimal loss in precision to discard noise and to reveal the basic structure of data sets. Data clustering addresses this tradeoff by optimizing a cost function which preserves the original data as completely as possible and which simultaneously favors prototypes with minimal complexity (Linde et al., 1980; Gray, 1984; Chou et al., 1989; Rose et al., 1990). We discuss an objective function for the joint optimization of distortion errors and the complexity of a reduced data representation. A maximum entropy estimation of the cluster assignments yields a unifying framework for clustering algorithms with a number of different distortion and complexity measures. The close analogy of complexity-optimized clustering with winner-take-all neural networks suggests a neural-like implementation resembling topological feature maps (see Figure 1).
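The central-clustering reestimation loop has a compact form: Gibbs assignment probabilities at temperature T, followed by probability-weighted centroid updates. The Python sketch below uses made-up data and an assumed annealing schedule; it is a minimal instance of the maximum entropy scheme, not the paper's full procedure.

```python
# Maximum entropy central clustering: soft assignments at temperature T,
# then weighted centroid reestimation, with T annealed toward zero.
import numpy as np

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(m, 0.3, (100, 2)) for m in ((0, 0), (3, 0), (0, 3))])
K, T = 3, 1.0
C = X[rng.choice(len(X), K, replace=False)]    # initial cluster centers

for _ in range(50):
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)  # squared distances
    logits = -d2 / T
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)                    # assignment probabilities
    C = (p.T @ X) / (p.sum(axis=0)[:, None] + 1e-12)     # reestimated centers
    T *= 0.9                                             # annealing schedule

print(C)
```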


Cross-Validation Estimates IMSE

Neural Information Processing Systems

Integrated Mean Squared Error (IMSE) is a version of the usual mean squared error criterion, averaged over all possible training sets of a given size. If it could be observed, it could be used to determine optimal network complexity or optimal data subsets for efficient training. We show that two common methods of cross-validating average squared error deliver unbiased estimates of IMSE, converging to IMSE with probability one. These estimates thus make possible approximate IMSE-based choice of network complexity. We also show that two variants of the cross-validation measure provide unbiased IMSE-based estimates potentially useful for selecting optimal data subsets.

1 Summary

To begin, assume we are given a fixed network architecture.
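For concreteness, the estimator in question is the familiar k-fold cross-validated average squared error. The sketch below computes it for polynomial models of increasing degree on synthetic data; the model family, data, and fold count are assumptions for illustration, standing in for network complexity:

```python
# k-fold cross-validated average squared error as a model-complexity
# criterion, shown for polynomial fits on synthetic regression data.
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(0, 0.2, x.size)   # synthetic data

def cv_mse(degree, k=5):
    """k-fold cross-validated squared error for a degree-d polynomial fit."""
    idx = rng.permutation(x.size)
    folds = np.array_split(idx, k)
    errs = []
    for f in folds:
        train = np.setdiff1d(idx, f)             # hold out fold f
        coef = np.polyfit(x[train], y[train], degree)
        errs.append(np.mean((np.polyval(coef, x[f]) - y[f]) ** 2))
    return np.mean(errs)

for d in (1, 3, 5, 9):                           # increasing model complexity
    print(d, cv_mse(d))
```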



Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming

Neural Information Processing Systems

Dynamic programming provides a methodology to plan trajectories and design controllers and estimators for nonlinear systems. However, general dynamic programming is computationally intractable. We have developed procedures that allow more complex planning problems to be solved. We have modified the State Increment Dynamic Programming approach of Larson (1968) in several ways:

1. In State Increment DP, a constant action is integrated to form a trajectory segment from the center of a cell to its boundary. We use second-order local trajectory optimization (Differential Dynamic Programming) to generate an optimal trajectory and to form an optimal policy in a tube surrounding the optimal trajectory within a cell. The trajectory segment and local policy are globally optimal, up to the resolution of the representation of the value function on the boundary of the cell.
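The gain from replacing a constant action with a locally optimized segment can be seen in a toy 1-D problem. The dynamics, costs, and boundary value below are illustrative assumptions (and the local optimizer is a generic one, not the paper's DDP machinery); the point is only that optimizing the action sequence within a cell is never worse than the best constant action.

```python
# Toy contrast: constant action over a cell segment (State Increment DP
# style) versus a locally optimized action sequence. Everything here is an
# assumed illustration, not the paper's formulation.
import numpy as np
from scipy.optimize import minimize

dt, n_steps, x0 = 0.1, 10, 0.0

def boundary_value(x):
    return (x - 1.0) ** 2                # assumed value stored on the boundary

def rollout(actions):
    """Integrate x_{t+1} = x_t + u_t*dt; accumulate running + boundary cost."""
    x, cost = x0, 0.0
    for u in actions:
        cost += (0.1 * u ** 2 + x ** 2) * dt   # assumed running cost
        x += u * dt
    return cost + boundary_value(x)

# Constant-action segment: search a grid of constant controls.
best_const = min(rollout(np.full(n_steps, u)) for u in np.linspace(-2, 2, 41))

# Locally optimized segment: optimize the whole action sequence.
res = minimize(rollout, np.zeros(n_steps))
print(best_const, res.fun)   # the optimized segment attains lower cost
```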


Convergence of Stochastic Iterative Dynamic Programming Algorithms

Neural Information Processing Systems

Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of these methods has been missing. In this paper we relate DP-based learning algorithms to the powerful techniques of stochastic approximation via a new convergence theorem, enabling us to establish a class of convergent algorithms to which both TD(λ) and Q-learning belong.

1 INTRODUCTION

Learning to predict the future and to find an optimal way of controlling it are the basic goals of learning systems that interact with their environment. A variety of algorithms are currently being studied for the purposes of prediction and control in incompletely specified, stochastic environments. Here we consider learning algorithms defined in Markov environments. There are actions or controls (u) available to the learner that affect both the state transition probabilities and the probability distribution of the immediate, state-dependent costs c_i(u) incurred by the learner.
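A minimal member of this class is tabular Q-learning, whose update is exactly a stochastic approximation step toward the one-step cost plus the discounted cost-to-go of the successor state. The toy MDP below is made up; the cost formulation (minimization) matches the abstract's notation c_i(u).

```python
# Tabular Q-learning on a random toy MDP: a stochastic approximation update
# Q(s,u) <- Q(s,u) + alpha * (c(s,u) + gamma * min_u' Q(s',u') - Q(s,u)).
import numpy as np

rng = np.random.default_rng(6)
n_states, n_actions, gamma, alpha = 4, 2, 0.9, 0.1
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)            # transition probabilities
c = rng.random((n_states, n_actions))        # immediate costs c_i(u)

Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(20000):
    u = rng.integers(n_actions)              # exploratory action choice
    s2 = rng.choice(n_states, p=P[s, u])     # sampled next state
    target = c[s, u] + gamma * Q[s2].min()   # cost formulation: minimize
    Q[s, u] += alpha * (target - Q[s, u])    # stochastic approximation step
    s = s2

print(Q.min(axis=1))                         # estimated optimal cost-to-go
```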


Speaker Recognition Using Neural Tree Networks

Neural Information Processing Systems

A new classifier is presented for text-independent speaker recognition: the modified neural tree network (MNTN). The neural tree network (NTN) is a hierarchical classifier that combines the properties of decision trees and feed-forward neural networks. The MNTN differs from the standard NTN in that a new learning rule based on discriminant learning is used, which minimizes the classification error as opposed to a norm of the approximation error. The MNTN also uses leaf probability measures in addition to the class labels.
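The sketch below illustrates only the generic neural-tree idea; the MNTN's discriminant learning rule is not reproduced. Each internal node here is a simple perceptron that routes inputs to a child, and leaves store class-probability estimates (the "leaf probability measures"). All details are illustrative assumptions.

```python
# A toy neural tree: perceptron nodes route inputs; leaves hold class
# probabilities. Not the MNTN's learning rule, just the tree structure.
import numpy as np

rng = np.random.default_rng(7)

class Node:
    def __init__(self, X, y, depth=0, max_depth=3):
        self.leaf_probs = np.bincount(y, minlength=2) / len(y)  # leaf measure
        self.w = None
        if depth == max_depth or len(np.unique(y)) == 1:
            return                                   # stay a leaf
        # Train a single perceptron to split the data at this node.
        w = np.zeros(X.shape[1] + 1)
        Xb = np.hstack([X, np.ones((len(X), 1))])    # append bias input
        for _ in range(50):
            for xi, yi in zip(Xb, 2 * y - 1):        # labels in {-1, +1}
                if yi * (w @ xi) <= 0:
                    w += 0.1 * yi * xi
        side = Xb @ w > 0
        if side.all() or (~side).all():
            return                                   # no useful split found
        self.w = w
        self.left = Node(X[~side], y[~side], depth + 1, max_depth)
        self.right = Node(X[side], y[side], depth + 1, max_depth)

    def prob(self, x):
        if self.w is None:
            return self.leaf_probs
        child = self.right if self.w @ np.append(x, 1.0) > 0 else self.left
        return child.prob(x)

# Two synthetic "speakers" as 2-D feature clusters.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2.5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
tree = Node(X, y)
print(tree.prob(np.array([2.5, 2.5])))   # class-probability estimate at a leaf
```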