
Bounds on the complexity of recurrent neural network implementations of finite state machines

Neural Information Processing Systems

In this paper we shall be concerned with Mealy machines, although our approach can easily be extended to other formulations to yield equivalent results.
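
For concreteness, here is a minimal sketch of a Mealy machine in Python, where the output depends on both the current state and the current input symbol; the states, alphabet, and example machine are illustrative and not taken from the paper.

```python
# A minimal sketch of a Mealy machine: output is a function of both the
# current state and the current input symbol. All names are illustrative.

class MealyMachine:
    def __init__(self, transitions, start):
        # transitions: {(state, symbol): (next_state, output)}
        self.transitions = transitions
        self.start = start

    def run(self, inputs):
        state, outputs = self.start, []
        for symbol in inputs:
            state, out = self.transitions[(state, symbol)]
            outputs.append(out)
        return outputs

# Example: output 1 exactly when the current input bit repeats the last one.
m = MealyMachine(
    transitions={
        ("s0", 0): ("saw0", 0), ("s0", 1): ("saw1", 0),
        ("saw0", 0): ("saw0", 1), ("saw0", 1): ("saw1", 0),
        ("saw1", 0): ("saw0", 0), ("saw1", 1): ("saw1", 1),
    },
    start="s0",
)
print(m.run([0, 0, 1, 1, 0]))  # [0, 1, 0, 1, 0]
```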




Synchronization, oscillations, and 1/f noise in networks of spiking neurons

Neural Information Processing Systems

The model consists of a two-dimensional sheet of leaky integrate-and-fire neurons with feedback connectivity consisting of local excitation and surround inhibition. Each neuron is independently driven by homogeneous external noise. Spontaneous symmetry breaking occurs, resulting in the formation of "hotspots" of activity in the network. These localized patterns of excitation appear as clusters that coalesce, disintegrate, or fluctuate in size while simultaneously moving in a random walk constrained by the interaction with other clusters. The emergent cross-correlation functions have a dual structure, with a sharp peak around zero on top of a much broader hill.
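
As a rough illustration of the dynamics described above, the sketch below steps a two-dimensional sheet of leaky integrate-and-fire neurons driven by homogeneous external noise; all parameter values are illustrative, and the local-excitation/surround-inhibition coupling is omitted for brevity.

```python
import numpy as np

def lif_step(v, i_ext, dt=1.0, tau=20.0, v_thresh=1.0, v_reset=0.0):
    """One Euler step for a sheet of leaky integrate-and-fire neurons."""
    v = v + dt / tau * (-v + i_ext)     # leaky integration of input current
    spikes = v >= v_thresh
    v = np.where(spikes, v_reset, v)    # reset neurons that fired
    return v, spikes

rng = np.random.default_rng(0)
v = np.zeros((32, 32))                  # membrane potentials on a 2-D sheet
for t in range(1000):
    noise = rng.normal(1.05, 0.5, v.shape)   # homogeneous external noise
    # The paper adds recurrent input from local excitation and surround
    # inhibition here; that coupling term is omitted in this sketch.
    v, spikes = lif_step(v, noise)
```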


Identifying Fault-Prone Software Modules Using Feed-Forward Networks: A Case Study

Neural Information Processing Systems

Functional complexity of a software module can be measured in terms of static complexity metrics of the program text. Classifying software modules, based on their static complexity measures, into different fault-prone categories is a difficult problem in software engineering. This research investigates the applicability of neural network classifiers for identifying fault-prone software modules using a data set from a commercial software system. A preliminary empirical comparison is performed between a minimum-distance-based Gaussian classifier, a perceptron classifier, and a multilayer feed-forward network classifier constructed using a modified Cascade-Correlation algorithm. The modified version of the Cascade-Correlation algorithm constrains the growth of the network size by incorporating a cross-validation check during the output-layer training phase. Our preliminary results suggest that a multilayer feed-forward network can be used as a tool for identifying fault-prone software modules early in the development cycle. Other issues, such as the representation of software metrics and the selection of proper training samples, are also discussed.
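
One plausible form of the cross-validation check that constrains network growth is sketched below: stop adding hidden units once error on a held-out set stops improving. The train_with_new_unit and valid_error callables are hypothetical stand-ins for one Cascade-Correlation growth step and the evaluation routine, not the paper's actual code.

```python
# A minimal sketch of validation-based growth control, assuming
# hypothetical helpers:
#   train_with_new_unit(net, data) -> net with one more trained hidden unit
#   valid_error(net, data)         -> scalar error on held-out data

def grow_until_validation_stops(net, train_data, valid_data,
                                train_with_new_unit, valid_error,
                                patience=2):
    best_err, stalls = float("inf"), 0
    while stalls < patience:
        net = train_with_new_unit(net, train_data)  # one growth step
        err = valid_error(net, valid_data)
        if err < best_err:
            best_err, stalls = err, 0
        else:
            stalls += 1          # held-out error no longer improving
    return net
```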


Assessing the Quality of Learned Local Models

Neural Information Processing Systems

An approach is presented to learning high-dimensional functions in the case where the learning algorithm can affect the generation of new data. A local modeling algorithm, locally weighted regression, is used to represent the learned function. Architectural parameters of the approach, such as distance metrics, are also localized and become a function of the query point instead of being global. Statistical tests are given for deciding when a local model is good enough and sampling should be moved to a new area. Our methods explicitly deal with the case where prediction accuracy requirements exist during exploration: by gradually shifting a "center of exploration" and controlling the speed of the shift with local prediction accuracy, a goal-directed exploration of state space takes place along the fringes of the current data support until the task goal is achieved.
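
As background for the local modeling component, here is a minimal sketch of locally weighted (ridge) regression: each query point gets its own linear fit, with training points weighted by a Gaussian kernel. The bandwidth h stands in for a localized distance-metric parameter; its value and the ridge term are illustrative.

```python
import numpy as np

def lwr_predict(X, y, x_query, h=1.0, ridge=1e-6):
    """Predict at x_query with a locally weighted affine fit.

    X: (n, d) training inputs, y: (n,) targets, x_query: (d,) query point.
    """
    d2 = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * h ** 2))             # Gaussian kernel weights
    Xb = np.hstack([X, np.ones((len(X), 1))])    # affine (bias) feature
    W = np.diag(w)
    A = Xb.T @ W @ Xb + ridge * np.eye(Xb.shape[1])
    beta = np.linalg.solve(A, Xb.T @ W @ y)      # local weighted least squares
    return np.append(x_query, 1.0) @ beta
```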


Use of Bad Training Data for Better Predictions

Neural Information Processing Systems

We show how randomly scrambling the output classes of various fractions of the training data may be used to improve the predictive accuracy of a classification algorithm. We present a method for calculating the "noise sensitivity signature" of a learning algorithm, which is based on scrambling the output classes. This signature can be used to indicate a good match between the complexity of the classifier and the complexity of the data. Use of noise sensitivity signatures is distinctly different from other schemes to avoid overtraining, such as cross-validation, which uses only part of the training data, or various penalty functions, which are not data-adaptive. Noise sensitivity signature methods use all of the training data and are manifestly data-adaptive and nonparametric.
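
A minimal sketch of the scrambling procedure might look as follows: retrain on copies of the data with a fraction of labels randomly reassigned and record how accuracy degrades. The train and accuracy callables are placeholders for whatever learner is being probed, and the fractions are illustrative.

```python
import numpy as np

def noise_sensitivity_signature(X, y, train, accuracy,
                                fractions=(0.0, 0.1, 0.2, 0.4), seed=0):
    """Accuracy vs. fraction of scrambled labels (X, y: numpy arrays)."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    signature = []
    for rho in fractions:
        y_noisy = y.copy()
        idx = rng.choice(len(y), size=int(rho * len(y)), replace=False)
        y_noisy[idx] = rng.choice(classes, size=len(idx))  # scramble labels
        model = train(X, y_noisy)
        signature.append((rho, accuracy(model, X, y)))
    return signature
```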


Fast Pruning Using Principal Components

Neural Information Processing Systems

In this procedure one transforms variables to a basis in which the covariance is diagonal and then projects out the low-variance directions. While application of PCA to remove input variables is useful in some cases (Leen et al., 1990), there is no guarantee that low-variance variables have little effect on error. We propose a saliency measure, based on PCA, that identifies those variables that have the least effect on error. Our proposed Principal Components Pruning algorithm applies this measure to obtain a simple and cheap pruning technique in the context of supervised learning. As a special case (PCP in linear regression), one can bound, for unbiased linear models, the bias introduced by pruning the principal degrees of freedom in the model.
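
The sketch below illustrates pruning a linear layer in a principal-component basis. Note that, as the abstract stresses, the directions to keep should be chosen by their effect on error rather than by variance alone; here the number k is simply taken as given, so this is only the projection step, not the paper's saliency measure.

```python
import numpy as np

def pcp_linear(X, W, k):
    """Project the weights of a linear layer y = x @ W onto the span of
    the k leading principal components of the inputs X (shape (n, d))."""
    C = np.cov(X, rowvar=False)
    eigvals, U = np.linalg.eigh(C)       # eigenvalues in ascending order
    U_keep = U[:, -k:]                   # k leading principal directions
    return U_keep @ (U_keep.T @ W)       # effective low-rank weights
```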


Learning Classification with Unlabeled Data

Neural Information Processing Systems

We represent objects with n-dimensional pattern vectors and consider piecewise-linear classifiers consisting of a collection of (labeled) codebook vectors in the space of the input patterns (see Figure 1). The classification boundaries are given by the Voronoi tessellation of the codebook vectors. Patterns are said to belong to the class (given by the label) of the codebook vector to which they are closest.
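
A minimal sketch of this nearest-codebook classification rule, with illustrative codebook vectors and labels:

```python
import numpy as np

def classify(patterns, codebooks, labels):
    """Assign each pattern the label of its nearest codebook vector, so
    decision boundaries follow the Voronoi tessellation of the codebooks."""
    d2 = ((patterns[:, None, :] - codebooks[None, :, :]) ** 2).sum(-1)
    return labels[np.argmin(d2, axis=1)]

codebooks = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array(["A", "B"])
print(classify(np.array([[0.2, 0.1], [0.9, 0.7]]), codebooks, labels))
# ['A' 'B']
```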


Grammatical Inference by Attentional Control of Synchronization in an Oscillating Elman Network

Neural Information Processing Systems

We show how an "Elman" network architecture, constructed from recurrently connected oscillatory associative memory network modules, can employ selective "attentional" control of synchronization to direct the flow of communication and computation within the architecture to solve a grammatical inference problem. Previously we have shown how the discrete time "Elman" network algorithm can be implemented in a network completely described by continuous ordinary differential equations. The time steps (machine cycles) of the system are implemented by rhythmic variation (clocking) of a bifurcation parameter. In this architecture, oscillation amplitude codes the information content or activity of a module (unit), whereas phase and frequency are used to "softwire" the network. Only synchronized modules communicate by exchanging amplitude information; the activity of non-resonating modules contributes incoherent crosstalk noise. Attentional control is modeled as a special subset of the hidden modules whose outputs affect the resonant frequencies of other hidden modules. They control synchrony among the other modules and direct the flow of computation (attention) to effect transitions between two subgraphs of a thirteen-state automaton which the system emulates to generate a Reber grammar. The internal crosstalk noise is used to drive the required random transitions of the automaton.
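
For reference, here is a minimal sketch of the discrete-time "Elman" update, the machine cycle that the oscillatory architecture is said to implement via a clocked bifurcation parameter; the weights, dimensions, and tanh nonlinearity are illustrative, not taken from the paper.

```python
import numpy as np

def elman_step(x, h_prev, W_in, W_rec, W_out):
    """One discrete-time Elman update: hidden state mixes the current
    input with the previous hidden (context) state, then produces output."""
    h = np.tanh(W_in @ x + W_rec @ h_prev)   # hidden state with context feedback
    y = np.tanh(W_out @ h)                   # output (e.g., next-symbol prediction)
    return y, h
```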