Rapid Quality Estimation of Neural Network Input Representations
Cherkauer, Kevin J., Shavlik, Jude W.
However, ANNs are usually costly to train, preventing one from trying many different representations. In this paper, we address this problem by introducing and evaluating three new measures for quickly estimating ANN input representation quality. Two of these, called DBleaves and Min(leaves), consistently outperform Rendell and Ragavan's (1993) blurring measure in accurately ranking different input representations for ANN learning on three difficult, real-world datasets.
Harmony Networks Do Not Work
Harmony networks have been proposed as a means by which connectionist models can perform symbolic computation. Indeed, proponents claim that a harmony network can be built that constructs parse trees for strings in a context-free language. This paper shows that harmony networks do not work in the following sense: they construct many outputs that are not valid parse trees. In order to show that the notion of systematicity is compatible with connectionism, Paul Smolensky, Geraldine Legendre and Yoshiro Miyata (Smolensky, Legendre, and Miyata 1992; Smolensky 1993; Smolensky, Legendre, and Miyata 1994) proposed a mechanism, "Harmony Theory," by which connectionist models purportedly perform structure-sensitive operations without implementing classical algorithms. Harmony theory describes a "harmony network" which, in the course of reaching a stable equilibrium, apparently computes parse trees that are valid according to the rules of a particular context-free grammar.
A Model of Spatial Representations in Parietal Cortex Explains Hemineglect
Pouget, Alexandre, Sejnowski, Terrence J.
We have recently developed a theory of spatial representations in which the position of an object is not encoded in a particular frame of reference but, instead, involves neurons computing basis functions of their sensory inputs. This type of representation is able to perform nonlinear sensorimotor transformations and is consistent with the response properties of parietal neurons. We now ask whether the same theory could account for the behavior of human patients with parietal lesions. These lesions induce a deficit known as hemineglect that is characterized by a lack of reaction to stimuli located in the hemispace contralateral to the lesion. A simulated lesion in a basis function representation was found to replicate three of the most important aspects of hemineglect: i) the models failed to cross the leftmost lines in line cancellation experiments, ii) the deficit affected multiple frames of reference, and iii) it could be object centered. These results strongly support the basis function hypothesis for spatial representations and provide a computational theory of hemineglect at the single cell level. 1 Introduction According to current theories of spatial representations, the positions of objects are represented in multiple modules throughout the brain, each module being specialized for a particular sensorimotor transformation and using its own frame of reference. For instance, the lateral intraparietal area (LIP) appears to encode the location of objects in oculocentric coordinates, presumably for the control of saccadic eye movements.
Generalized Learning Vector Quantization
We propose a new learning method, "Generalized Learning Vector Quantization (GLVQ)," in which reference vectors are updated based on the steepest descent method in order to minimize the cost function. The cost function is determined so that the obtained learning rule satisfies the convergence condition. We prove that Kohonen's rule as used in LVQ does not satisfy the convergence condition and thus degrades recognition ability. Experimental results for printed Chinese character recognition reveal that GLVQ is superior to LVQ in recognition ability.
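The steepest-descent update described above can be sketched roughly as follows. The abstract does not state the cost function, so this sketch assumes the relative-distance measure mu = (d1 - d2) / (d1 + d2) commonly associated with GLVQ, passed through a sigmoid; the function names, the learning rate, and the toy data are hypothetical and not taken from the paper.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def glvq_step(x, label, protos, proto_labels, lr=0.1):
    """One gradient step on f(mu), with mu = (d1 - d2) / (d1 + d2).

    d1: squared distance to the nearest reference vector of the same class.
    d2: squared distance to the nearest reference vector of another class.
    Mutates `protos` in place and returns mu as measured before the update.
    Assumes each class present in `proto_labels` has at least one vector.
    """
    same = [i for i, c in enumerate(proto_labels) if c == label]
    other = [i for i, c in enumerate(proto_labels) if c != label]
    i1 = min(same, key=lambda i: sq_dist(x, protos[i]))
    i2 = min(other, key=lambda i: sq_dist(x, protos[i]))
    d1 = sq_dist(x, protos[i1])
    d2 = sq_dist(x, protos[i2])
    total = d1 + d2
    if total == 0.0:
        return 0.0  # degenerate case: both vectors coincide with x
    mu = (d1 - d2) / total
    f = 1.0 / (1.0 + math.exp(-mu))   # assumed monotone cost f(mu): a sigmoid
    slope = f * (1.0 - f)             # derivative of the sigmoid at mu
    # Constant factors from the gradient are absorbed into the learning rate.
    pull = lr * slope * d2 / total ** 2
    push = lr * slope * d1 / total ** 2
    # Move the correct-class vector toward x and the wrong-class one away.
    protos[i1] = [w + pull * (xi - w) for w, xi in zip(protos[i1], x)]
    protos[i2] = [w - push * (xi - w) for w, xi in zip(protos[i2], x)]
    return mu

# Toy usage: two reference vectors, one per class.
protos = [[0.0, 0.0], [1.0, 1.0]]
proto_labels = [0, 1]
mu_before = glvq_step([0.2, 0.1], 0, protos, proto_labels)
mu_after = glvq_step([0.2, 0.1], 0, protos, proto_labels)
```

In this sketch both reference vectors are updated on every sample, with step sizes weighted by the other vector's distance, which is one way the rule differs from a plain nearest-prototype update.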
Active Gesture Recognition using Learned Visual Attention
Darrell, Trevor, Pentland, Alex
We have developed a foveated gesture recognition system that runs in an unconstrained office environment with an active camera. Using vision routines previously implemented for an interactive environment, we determine the spatial location of salient body parts of a user and guide an active camera to obtain images of gestures or expressions. A hidden-state reinforcement learning paradigm is used to implement visual attention. The attention module selects targets to foveate based on the goal of successful recognition, and uses a new multiple-model Q-learning formulation. Given a set of target and distractor gestures, our system can learn where to foveate to maximally discriminate a particular gesture. 1 INTRODUCTION Vision has numerous uses in the natural world.
Learning Sparse Perceptrons
Jackson, Jeffrey C., Craven, Mark
We introduce a new algorithm designed to learn sparse perceptrons over input representations which include high-order features. Our algorithm, which is based on a hypothesis-boosting method, is able to PAC-learn a relatively natural class of target concepts. Moreover, the algorithm appears to work well in practice: on a set of three problem domains, the algorithm produces classifiers that utilize small numbers of features yet exhibit good generalization performance. Perhaps most importantly, our algorithm generates concept descriptions that are easy for humans to understand. 1 Introduction Multi-layer perceptron (MLP) learning is a powerful method for tasks such as concept classification. However, in many applications, such as those that may involve scientific discovery, it is crucial to be able to explain predictions. Multi-layer perceptrons are limited in this regard, since their representations are notoriously difficult for humans to understand.
Gaussian Processes for Regression
Williams, Christopher K. I., Rasmussen, Carl Edward
The Bayesian analysis of neural networks is difficult because a simple prior over weights implies a complex prior distribution over functions. In this paper we investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis for fixed values of hyperparameters to be carried out exactly using matrix operations. Two methods, using optimization and averaging (via Hybrid Monte Carlo) over hyperparameters, have been tested on a number of challenging problems and have produced excellent results. 1 INTRODUCTION In the Bayesian approach to neural networks a prior distribution over the weights induces a prior distribution over functions. This prior is combined with a noise model, which specifies the probability of observing the targets t given function values y, to yield a posterior over functions which can then be used for predictions. For neural networks the prior over functions has a complex form which means that implementations must either make approximations (e.g.
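As a rough illustration of the exact matrix computation mentioned in the abstract, the sketch below computes the Gaussian process predictive mean and variance for fixed hyperparameters. It assumes a squared-exponential covariance on scalar inputs and independent Gaussian noise, neither of which is specified in the text above; all function names and hyperparameter values are hypothetical.

```python
import math

def sq_exp_kernel(a, b, length=1.0, signal=1.0):
    """Assumed covariance function: squared exponential on scalar inputs."""
    return signal ** 2 * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(xs, ts, x_star, noise=0.1, length=1.0, signal=1.0):
    """Predictive mean and variance of the function value at x_star.

    mean = k*^T (K + noise^2 I)^-1 t
    var  = k(x*, x*) - k*^T (K + noise^2 I)^-1 k*
    """
    n = len(xs)
    K = [[sq_exp_kernel(xs[i], xs[j], length, signal)
          + (noise ** 2 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    k_star = [sq_exp_kernel(x, x_star, length, signal) for x in xs]
    alpha = chol_solve(L, ts)       # (K + noise^2 I)^-1 t
    v = chol_solve(L, k_star)       # (K + noise^2 I)^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = (sq_exp_kernel(x_star, x_star, length, signal)
           - sum(k * w for k, w in zip(k_star, v)))
    return mean, var

# Toy usage: three noisy observations, prediction at a new input.
mean, var = gp_predict([0.0, 1.0, 2.0], [0.0, 1.0, 0.0], 1.5)
```

The hyperparameters are held fixed here; choosing them by optimization or averaging over them, as the abstract describes, would wrap this prediction step in an outer procedure.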
Generalisation of A Class of Continuous Neural Networks
Shawe-Taylor, John, Zhao, Jieyu
More recently attempts have been made to introduce some computational cost related to the accuracy of the computations [5]. The model proposed in this paper weakens the computational power still further by relying on classical boolean circuits to perform the computation using a simple encoding of the real values. Using this encoding we also show that TC0 circuits interpreted in the model correspond to a Neural Network design referred to as Bit Stream Neural Networks, which have been developed for hardware implementation [8]. With the perspective afforded by the general approach considered here, we are also able to analyse the Bit Stream Neural Networks (or indeed any other adaptive system based on the technique), giving VC dimension and sample size bounds for PAC learning.
Sample Complexity for Learning Recurrent Perceptron Mappings
DasGupta, Bhaskar, Sontag, Eduardo D.
Recurrent perceptron classifiers generalize the classical perceptron model. They take into account those correlations and dependences among input coordinates which arise from linear digital filtering. This paper provides tight bounds on sample complexity associated with the fitting of such models to experimental data. 1 Introduction One of the most popular approaches to binary pattern classification, underlying many statistical techniques, is based on perceptrons or linear discriminants; see for instance the classical reference (Duda and Hart, 1973).
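One way to picture a classifier built on linear digital filtering is sketched below. The abstract does not give the filter equations, so this is an assumption: the input sequence is passed through a recursive linear filter with zero initial state, and the sign of the final output is the class. The function name, the feedback coefficients, and the threshold are hypothetical.

```python
def recurrent_perceptron(inputs, feedback, threshold=0.0):
    """Classify a sequence by the sign of a recursive linear filter's output.

    Assumed filter: y[t] = inputs[t] + sum_i feedback[i] * y[t - 1 - i],
    starting from a zero state. Returns +1 if the final output exceeds
    `threshold`, otherwise -1.
    """
    history = []  # past outputs, most recent first
    y = 0.0
    for u in inputs:
        y = u + sum(c * h for c, h in zip(feedback, history))
        history.insert(0, y)
        del history[len(feedback):]  # keep only as many past outputs as needed
    return 1 if y > threshold else -1

# Toy usage with a single feedback coefficient of 0.5.
label_a = recurrent_perceptron([1.0, -1.0, 1.0], [0.5])
label_b = recurrent_perceptron([1.0, -2.0, 0.5], [0.5])
```

With one feedback coefficient c, the final output is a weighted sum of the inputs with geometrically decaying weights, so the classifier is a perceptron whose weight vector is determined by far fewer free parameters than there are inputs.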