Constructing Proofs in Symmetric Networks
This paper considers the problem of expressing predicate calculus in connectionist networks that are based on energy minimization. Given a first-order logic knowledge base and a bound k, a symmetric network (such as a Boltzmann machine or a Hopfield network) is constructed that searches for a proof of a given query. If a resolution-based proof of length no longer than k exists, then the global minima of the energy function that is associated with the network represent such proofs. The network that is generated is of size cubic in the bound k and linear in the knowledge size. There are no restrictions on the type of logic formulas that can be represented.
Network Model of State-Dependent Sequencing
Sutton, Jeffrey P., Mamelak, Adam N., Hobson, J. Allan
A network model with temporal sequencing and state-dependent modulatory features is described. The model is motivated by neurocognitive data characterizing different states of waking and sleeping. Computer studies demonstrate how unique states of sequencing can exist within the same network under different aminergic and cholinergic modulatory influences. Relationships between state-dependent modulation, memory, sequencing and learning are discussed.
Data Analysis using G/SPLINES
G/SPLINES is an algorithm for building functional models of data. It uses genetic search to discover combinations of basis functions which are then used to build a least-squares regression model. Because it produces a population of models which evolve over time rather than a single model, it allows analysis not possible with other regression-based approaches.
1 INTRODUCTION
G/SPLINES is a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm (Friedman, 1990) with Holland's Genetic Algorithm (Holland, 1975). G/SPLINES has advantages over MARS in that it requires fewer least-squares computations, is easily extendable to non-spline basis functions, may discover models inaccessible to local-variable selection algorithms, and allows significantly larger problems to be considered. These issues are discussed in (Rogers, 1991). This paper begins with a discussion of linear regression models, followed by a description of the G/SPLINES algorithm, and finishes with a series of experiments illustrating its performance, robustness, and analysis capabilities.
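As a rough illustration of the idea (not the authors' implementation; the basis set, genome encoding, and operators below are our own toy assumptions), a genetic search can evolve subsets of candidate hinge-spline basis functions, scoring each subset by the residual of its least-squares fit:

```python
# Toy G/SPLINES-style search: a genome is a bit vector selecting which
# candidate basis functions enter a least-squares regression model.
import numpy as np

rng = np.random.default_rng(0)

# Candidate truncated-linear "hinge" basis functions, as in MARS.
KNOTS = [-1.0, -0.5, 0.0, 0.5, 1.0]
BASIS = [lambda x, k=k: np.maximum(0.0, x - k) for k in KNOTS]

def design_matrix(x, genome):
    """Columns: intercept plus every basis function switched on in genome."""
    cols = [np.ones_like(x)] + [f(x) for f, g in zip(BASIS, genome) if g]
    return np.stack(cols, axis=1)

def fitness(x, y, genome):
    """Negative residual sum of squares of the least-squares fit."""
    A = design_matrix(x, genome)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return -float(np.sum((A @ coef - y) ** 2))

def evolve(x, y, pop_size=20, generations=30):
    pop = rng.integers(0, 2, size=(pop_size, len(BASIS)))
    for _ in range(generations):
        scores = np.array([fitness(x, y, g) for g in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, len(BASIS))        # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child[rng.integers(len(BASIS))] ^= 1     # point mutation
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(x, y, g) for g in pop])
    return pop[int(np.argmax(scores))], float(scores.max())

# Recover a hinge at 0 from noisy samples.
x = rng.uniform(-1, 1, 200)
y = np.maximum(0.0, x) + 0.01 * rng.normal(size=x.size)
best, score = evolve(x, y)
```

Because the top half of each population survives unchanged, the best basis combination found so far is never lost, which is what lets the population be inspected as it evolves.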
Time-Warping Network: A Hybrid Framework for Speech Recognition
Levin, Esther, Pieraccini, Roberto, Bocchieri, Enrico
Hybrid speech recognition systems attempt to combine the best features of both underlying models: the temporal structure of hidden Markov models (HMMs) and the discriminative power of neural networks. In this work we define a time-warping (TW) neuron that extends the operation of the formal neuron of a back-propagation network by warping the input pattern to match it optimally to its weights. We show that a single-layer network of TW neurons is equivalent to a Gaussian-density HMM-based recognition system.
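A minimal sketch of what such a neuron might compute (our construction under stated assumptions, not the paper's definition): align the input sequence to the weight sequence by dynamic programming, as in dynamic time warping, and pass the best alignment score through the usual sigmoid.

```python
# A toy "time-warping neuron": dynamic programming finds the monotone
# alignment of input x against weight sequence w that maximizes the sum
# of pairwise products, and the result is squashed by a sigmoid.
import math

def tw_neuron(x, w):
    """Sigmoid of the best warped inner product of x against w.

    D[i][j] = best sum of x[a]*w[b] products over monotone alignments
    of x[:i+1] with w[:j+1] (steps: diagonal, repeat-weight, repeat-input).
    """
    n, m = len(x), len(w)
    NEG = float("-inf")
    D = [[NEG] * m for _ in range(n)]
    D[0][0] = x[0] * w[0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = NEG
            for pi, pj in ((i - 1, j - 1), (i - 1, j), (i, j - 1)):
                if pi >= 0 and pj >= 0 and D[pi][pj] > best:
                    best = D[pi][pj]
            D[i][j] = best + x[i] * w[j]
    return 1.0 / (1.0 + math.exp(-D[n - 1][m - 1]))
```

For example, `tw_neuron([1, 1, 2], [1, 2])` warps a three-frame input onto a two-element weight vector; the neuron fires strongly whenever some warping of the input matches the weight template well, regardless of its exact duration.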
Information Measure Based Skeletonisation
Ramachandran, Sowmya, Pratt, Lorien Y.
Automatic determination of proper neural network topology by trimming oversized networks is an important area of study, which has previously been addressed using a variety of techniques. In this paper, we present Information Measure Based Skeletonisation (IMBS), a new approach to this problem where superfluous hidden units are removed based on their information measure (IM). This measure, borrowed from decision tree induction techniques, reflects the degree to which the hyperplane formed by a hidden unit discriminates between training data classes. We show the results of applying IMBS to three classification tasks and demonstrate that it removes a substantial number of hidden units without significantly affecting network performance.
Active Exploration in Dynamic Environments
Thrun, Sebastian B., Möller, Knut
Many real-valued connectionist approaches to learning control realize exploration by randomness in action selection. This might be disadvantageous when costs are assigned to "negative experiences". The basic idea presented in this paper is to make an agent explore unknown regions in a more directed manner. This is achieved by a so-called competence map, which is trained to predict the controller's accuracy, and is used for guiding exploration. Based on this, a bistable system enables smoothly switching attention between two behaviors - exploration and exploitation - depending on expected costs and knowledge gain. The appropriateness of this method is demonstrated by a simple robot navigation task.
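A minimal sketch of competence-guided exploration (our own toy construction, not the paper's controller or network): a competence map tracks the predictor's recent error per action, and action selection trades expected reward against expected knowledge gain instead of acting randomly.

```python
# Directed exploration with a competence map: choose the action that
# maximizes (predicted value) + beta * (predicted model error).
import random

random.seed(0)

N_ACTIONS = 3
true_reward = [0.2, 0.8, 0.5]          # hidden from the agent
model = [0.0] * N_ACTIONS              # learned reward model
competence_err = [1.0] * N_ACTIONS     # competence map: predicted model error

def select(beta):
    """Trade exploitation (model value) against knowledge gain (error)."""
    score = [model[a] + beta * competence_err[a] for a in range(N_ACTIONS)]
    return max(range(N_ACTIONS), key=score.__getitem__)

for step in range(200):
    a = select(beta=0.5)
    r = true_reward[a] + random.gauss(0.0, 0.05)
    # Update the competence map toward the observed prediction error,
    # then update the model itself.
    competence_err[a] = 0.9 * competence_err[a] + 0.1 * abs(r - model[a])
    model[a] += 0.2 * (r - model[a])
```

Early on, every action carries high predicted error, so the agent explores; as the competence map's error estimates decay, the value term dominates and behavior shifts smoothly to exploitation, which is the bistable switching the abstract describes in miniature.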
Perturbing Hebbian Rules
Dayan, Peter, Goodhill, Geoffrey
Feedforward networks composed of units which compute a sigmoidal function of a weighted sum of their inputs have been much investigated. We tested the approximation and estimation capabilities of networks using functions more complex than sigmoids. Three classes of functions were tested: polynomials, rational functions, and flexible Fourier series. Unlike sigmoids, these classes can fit nonmonotonic functions. They were compared on three problems: prediction of Boston housing prices, the sunspot count, and robot arm inverse dynamics. The complex units attained clearly superior performance on the robot arm problem, which is a highly nonmonotonic, pure approximation problem. On the noisy and only mildly nonlinear Boston housing and sunspot problems, differences among the complex units were revealed; polynomials did poorly, whereas rationals and flexible Fourier series were comparable to sigmoids.
A Connectionist Learning Approach to Analyzing Linguistic Stress
Gupta, Prahlad, Touretzky, David S.
We use connectionist modeling to develop an analysis of stress systems in terms of ease of learnability. In traditional linguistic analyses, learnability arguments determine default parameter settings based on the feasibility of logically deducing correct settings from an initial state. Our approach provides an empirical alternative to such arguments. Based on perceptron learning experiments using data from nineteen human languages, we develop a novel characterization of stress patterns in terms of six parameters. These provide both a partial description of the stress pattern itself and a prediction of its learnability, without invoking abstract theoretical constructs such as metrical feet. This work demonstrates that machine learning methods can provide a fresh approach to understanding linguistic phenomena.
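The experimental workhorse here is the classic perceptron learning rule. As a hedged illustration (the syllable encoding below is a made-up toy, not the paper's feature scheme), a perceptron can learn a simple "stress the initial syllable" pattern from binary syllable features:

```python
# Classic perceptron learning rule: on each mistake, move the weights
# toward (or away from) the misclassified example.
def perceptron_train(samples, epochs=20, lr=1.0):
    """samples: list of (feature_vector, label) pairs with label in {0, 1}."""
    n = len(samples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            if err:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

# Toy "fixed initial stress" pattern: stress iff the syllable is
# word-initial.  Features: (is_initial, is_final, is_heavy).
data = [
    ((1, 0, 0), 1), ((0, 0, 0), 0), ((0, 1, 0), 0),
    ((1, 0, 1), 1), ((0, 0, 1), 0), ((0, 1, 1), 0),
]
w, b = perceptron_train(data)
```

Since a perceptron can only learn linearly separable patterns, how quickly (or whether) it converges on a language's stress data is exactly the kind of learnability signal the analysis above exploits.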
Optical Implementation of a Self-Organizing Feature Extractor
Anderson, Dana Z., Benkert, Claus, Hebler, Verena, Jang, Ju-Seog, Montgomery, Don, Saffman, Mark
We demonstrate a self-organizing system based on photorefractive ring oscillators. We employ the system in two ways that can both be thought of as feature extractors; one acts on a set of images exposed repeatedly to the system strictly as a linear feature extractor, and the other serves as a signal demultiplexer for fiber optic communications. Both systems implement unsupervised competitive learning embedded within the mode interaction dynamics between the modes of a set of ring oscillators. After a training period, the modes of the rings become associated with the different image features or carrier frequencies within the incoming data stream.
Learning in Feedforward Networks with Nonsmooth Functions
Redding, Nicholas J., Downs, T.
This paper is concerned with the problem of learning in networks where some or all of the functions involved are not smooth. Examples of such networks are those whose neural transfer functions are piecewise-linear and those whose error function is defined in terms of the L∞ norm. Up to now, networks whose neural transfer functions are piecewise-linear have received very little consideration in the literature, but the possibility of using an error function defined in terms of the L∞ norm has received some attention. In this paper we draw upon some recent results from the field of nonsmooth optimization (NSO) to present an algorithm for the nonsmooth case. Our motivation for this work arose out of the fact that we have been able to show that, in backpropagation, an error function based upon the L∞ norm overcomes the difficulties which can occur when using the L2 norm.
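The standard NSO tool for such problems is the subgradient method. As a sketch (our own toy problem, not the paper's algorithm), the L∞ error of a linear model is nonsmooth, but a valid subgradient is obtained from any sample attaining the maximum error:

```python
# Subgradient descent on f(w) = max_i |x_i . w - y_i|, which is convex
# and piecewise linear but not differentiable at the "kinks".
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
w_true = np.array([2.0, -1.0])
y = X @ w_true                       # noiseless targets, so the optimum is 0

w = np.zeros(2)
best_err = float(np.max(np.abs(y)))  # L-infinity error of the zero model
for t in range(1, 2001):
    e = X @ w - y
    err = float(np.max(np.abs(e)))
    best_err = min(best_err, err)    # subgradient methods are not descent
                                     # methods, so track the best iterate
    i = int(np.argmax(np.abs(e)))    # a sample attaining the max error
    g = np.sign(e[i]) * X[i]         # a subgradient of f at w
    w -= (0.5 / np.sqrt(t)) * g      # diminishing step size
```

Unlike gradient descent, the objective is not guaranteed to fall at every step, which is why the best iterate so far is recorded; with a diminishing step size the best value converges to the minimum.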