Neural Control for Rolling Mills: Incorporating Domain Theories to Overcome Data Deficiency

Neural Information Processing Systems

In a Bayesian framework, we give a principled account of how domainspecific priorknowledge such as imperfect analytic domain theories can be optimally incorporated into networks of locally-tuned units: by choosing a specific architecture and by applying a specific training regimen. Our method proved successful in overcoming the data deficiency problem in a large-scale application to devise a neural control for a hot line rolling mill. It achieves in this application significantly higher accuracy than optimally-tuned standard algorithms such as sigmoidal backpropagation, and outperforms the state-of-the-art solution.


Benchmarking Feed-Forward Neural Networks: Models and Measures

Neural Information Processing Systems

Existing metrics for the learning performance of feed-forward neural networks do not provide a satisfactory basis for comparison because the choice of the training epoch limit can determine the results of the comparison. I propose new metrics which have the desirable property of being independent of the training epoch limit. The efficiency measures the yield of correct networks in proportion to the training effort expended. The optimal epoch limit provides the greatest efficiency. The learning performance is modelled statistically, and asymptotic performance is estimated. Implementation details may be found in (Harney, 1992). 1 Introduction The empirical comparison of neural network training algorithms is of great value in the development of improved techniques and in algorithm selection for problem solving. In view of the great sensitivity of learning times to the random starting weights (Kolen and Pollack, 1990), individual trial times such as reported in (Rumelhart, et al., 1986) are almost useless as measures of learning performance.


Learning to Segment Images Using Dynamic Feature Binding

Neural Information Processing Systems

Despite the fact that complex visual scenes contain multiple, overlapping objects, people perform object recognition with ease and accuracy. One operation that facilitates recognition is an early segmentation process in which features of objects are grouped and labeled according to which object theybelong. Current computational systems that perform this operation arebased on predefined grouping heuristics.


The Efficient Learning of Multiple Task Sequences

Neural Information Processing Systems

I present a modular network architecture and a learning algorithm based on incremental dynamic programming that allows a single learning agent to learn to solve multiple Markovian decision tasks (MDTs) with significant transferof learning across the tasks. I consider a class of MDTs, called composite tasks, formed by temporally concatenating a number of simpler, elemental MDTs. The architecture is trained on a set of composite andelemental MDTs. The temporal structure of a composite task is assumed to be unknown and the architecture learns to produce a temporal decomposition.It is shown that under certain conditions the solution of a composite MDT can be constructed by computationally inexpensive modifications of the solutions of its constituent elemental MDTs. 1 INTRODUCTION Most applications of domain independent learning algorithms have focussed on learning single tasks. Building more sophisticated learning agents that operate in complex environments will require handling multiple tasks/goals (Singh, 1992). Research efforton the scaling problem has concentrated on discovering faster learning algorithms, and while that will certainly help, techniques that allow transfer of learning across tasks will be indispensable for building autonomous learning agents that have to learn to solve multiple tasks. In this paper I consider a learning agent that interacts with an external, finite-state, discrete-time, stochastic dynamical environment andfaces multiple sequences of Markovian decision tasks (MDTs).



3D Object Recognition Using Unsupervised Feature Extraction

Neural Information Processing Systems

Gold Center for Neural Science, Brown University Providence, RI 02912, USA Shimon Edelman Dept. of Applied Mathematics and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel Abstract Intrator (1990) proposed a feature extraction method that is related to recent statistical theory (Huber, 1985; Friedman, 1987), and is based on a biologically motivated model of neuronal plasticity (Bienenstock et al., 1982). This method has been recently applied to feature extraction in the context of recognizing 3D objects from single 2D views (Intrator and Gold, 1991). Here we describe experiments designed to analyze the nature of the extracted features, and their relevance to the theory and psychophysics of object recognition. 1 Introduction Results of recent computational studies of visual recognition (e.g., Poggio and Edelman, 1990)indicate that the problem of recognition of 3D objects can be effectively reformulated in terms of standard pattern classification theory. According to this approach, an object is represented by a few of its 2D views, encoded as clusters in multidimentional space. Recognition of a novel view is then carried out by interpo-460 3D Object Recognition Using Unsupervised Feature Extraction 461 lating among the stored views in the representation space.


Neural Network Analysis of Event Related Potentials and Electroencephalogram Predicts Vigilance

Neural Information Processing Systems

Automated monitoring of vigilance in attention intensive tasks such as air traffic control or sonar operation is highly desirable. As the operator monitorsthe instrument, the instrument would monitor the operator, insuring against lapses. We have taken a first step toward this goal by using feedforwardneural networks trained with backpropagation to interpret event related potentials (ERPs) and electroencephalogram (EEG) associated withperiods of high and low vigilance. The accuracy of our system on an ERP data set averaged over 28 minutes was 96%, better than the 83% accuracy obtained using linear discriminant analysis. Practical vigilance monitoring will require prediction over shorter time periods. We were able to average the ERP over as little as 2 minutes and still get 90% correct prediction of a vigilance measure. Additionally, we achieved similarly good performance using segments of EEG power spectrum as short as 56 sec.


Network Model of State-Dependent Sequencing

Neural Information Processing Systems

A network model with temporal sequencing and state-dependent modulatory featuresis described. The model is motivated by neurocognitive data characterizing different states of waking and sleeping. Computer studies demonstrate how unique states of sequencing can exist within the same network under different aminergic and cholinergic modulatory influences. Relationships between state-dependent modulation, memory, sequencing and learning are discussed.