Constructing Hidden Units using Examples and Queries

Neural Information Processing Systems

While the network loading problem for 2-layer threshold nets is NP-hard when learning from examples alone (as with backpropagation), Baum (1991) has now proved that a learner can employ queries to evade the hidden-unit credit assignment problem and PAC-load nets with up to four hidden units in polynomial time. Empirical tests show that the method can also learn far more complicated functions, such as randomly generated networks with 200 hidden units. The algorithm easily approximates Wieland's 2-spirals function using a single layer of 50 hidden units, and requires only 30 minutes of CPU time to learn 200-bit parity to 99.7% accuracy.
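The abstract does not spell out the mechanism, but one standard way queries can help in this setting, sketched below purely as an assumption, is that membership queries let the learner binary-search the segment between a positive and a negative example to locate a point on a hidden unit's decision boundary. The oracle and target below are invented for illustration; this is not Baum's algorithm itself.

    import numpy as np

    def boundary_point(oracle, x_pos, x_neg, tol=1e-6):
        # Binary-search the segment between two differently labeled points;
        # each probe of the midpoint is one membership query to the target.
        lo, hi = x_neg, x_pos
        while np.linalg.norm(hi - lo) > tol:
            mid = (lo + hi) / 2.0
            if oracle(mid):
                hi = mid
            else:
                lo = mid
        return (lo + hi) / 2.0

    # Hypothetical target: a single threshold unit the learner may query.
    w, b = np.array([1.0, -2.0]), 0.5
    oracle = lambda x: float(w @ x + b) > 0
    p = boundary_point(oracle, np.array([3.0, 0.0]), np.array([-3.0, 0.0]))
    print(p, float(w @ p + b))   # second value is near zero: p lies on the hyperplane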


Modeling Time Varying Systems Using Hidden Control Neural Architecture

Neural Information Processing Systems

This paper introduces a generalization of the layered neural network that can implement a time-varying nonlinear mapping between its observable input and output. The variation of the network's mapping is due to an additional, hidden control input, while the network parameters remain unchanged. We propose an algorithm for finding the network parameters and the hidden control sequence from a training set of examples of observable input and output. This algorithm implements an approximate maximum likelihood estimation of the parameters of an equivalent statistical model, when only the dominant control sequence is taken into account. The conceptual difference between the proposed model and the HMM is as follows: in the HMM approach, the observable data in each state is modeled as though it were produced by a memoryless source, and a parametric description of this source is obtained during training; in the proposed model, the observations in each state are produced by a nonlinear dynamical system driven by noise, and both the parametric form of the dynamics and the noise are estimated. The performance of the model was illustrated for the tasks of nonlinear time-varying system modeling and continuously spoken digit recognition. The reported results show the potential of this model for providing high-performance speech recognition capability.
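As a rough illustration of the architecture and the alternating estimation described above, here is a minimal sketch, assuming a one-hidden-layer net, a small discrete control alphabet, and squared-error training; all names and sizes are invented and this is not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(0)
    N_IN, N_HID, N_OUT, N_CTRL = 4, 8, 2, 3          # sizes are assumptions

    # One fixed parameter set; the extra control input switches the mapping.
    W1 = rng.normal(0.0, 0.1, (N_HID, N_IN + N_CTRL))
    W2 = rng.normal(0.0, 0.1, (N_OUT, N_HID))

    def forward(x, c):
        # Map observable input x to an output, conditioned on control code c.
        z = np.tanh(W1 @ np.concatenate([x, c]))
        return W2 @ z, z

    def best_control(x, y):
        # The dominant control value: the code giving the lowest squared error.
        codes = list(np.eye(N_CTRL))
        errs = [np.sum((forward(x, c)[0] - y) ** 2) for c in codes]
        return codes[int(np.argmin(errs))]

    def train_step(x, y, lr=0.01):
        # Alternate: (1) assign the dominant control value for this example,
        # (2) update the shared parameters by a gradient step on squared error.
        global W1, W2
        c = best_control(x, y)
        xc = np.concatenate([x, c])
        out, z = forward(x, c)
        err = out - y
        dz = (W2.T @ err) * (1.0 - z ** 2)           # backprop through tanh
        W2 -= lr * np.outer(err, z)
        W1 -= lr * np.outer(dz, xc)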



Translating Locative Prepositions

Neural Information Processing Systems

The features used in the spatial representations were abstracted from Herskovits (1986). The network was trained using the generalized delta rule (Rumelhart, Hinton, and Williams, 1986) on a set of patterns with four components, three syntactic and one semantic. The syntactic components are a pair of nouns separated by a locative preposition [N1-LP-N2], and the semantic component is a representation of the spatial relationship [SR].
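A minimal sketch of what one such four-component training pattern might look like, assuming one-hot codes for the nouns and preposition and a binary feature vector for the spatial relation; the vocabularies and feature names below are invented for illustration, and the resulting (x, y) pairs would be fed to a standard net trained with the generalized delta rule.

    import numpy as np

    NOUNS = ["book", "table", "bowl"]                # hypothetical vocabulary
    PREPS = ["on", "in", "above"]
    SR_FEATURES = ["contact", "support", "inclusion", "higher_than"]

    def one_hot(item, vocab):
        v = np.zeros(len(vocab))
        v[vocab.index(item)] = 1.0
        return v

    def make_pattern(n1, lp, n2, sr_feats):
        # Input: the syntactic triple [N1-LP-N2]; target: the spatial relation [SR].
        x = np.concatenate([one_hot(n1, NOUNS), one_hot(lp, PREPS), one_hot(n2, NOUNS)])
        y = np.array([1.0 if f in sr_feats else 0.0 for f in SR_FEATURES])
        return x, y

    # "the book on the table": contact plus support
    x, y = make_pattern("book", "on", "table", {"contact", "support"})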


Stochastic Neurodynamics

Neural Information Processing Systems

The main point of this paper is that stochastic neural networks have a mathematical structure that corresponds quite closely with that of quantum field theory. Neural network Liouvillians and Lagrangians can be derived, just as spin Hamiltonians and Lagrangians can in QFT. It remains to show the efficacy of such a description.


Distributed Recursive Structure Processing

Neural Information Processing Systems

Harmonic grammar (Legendre et al., 1990) is a connectionist theory of linguistic well-formedness based on the assumption that the well-formedness of a sentence can be measured by the harmony (negative energy) of the corresponding connectionist state. Assuming a lower-level connectionist network that obeys a few general connectionist principles but is otherwise unspecified, we construct a higher-level network with an equivalent harmony function that captures the most linguistically relevant global aspects of the lower-level network. In this paper, we extend the tensor product representation (Smolensky, 1990) to fully recursive representations of recursively structured objects like sentences in the lower-level network. We show theoretically and with an example the power of the new technique for parallel distributed structure processing.
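As a rough sketch of tensor product binding extended to recursive structure, the fragment below assumes binary trees, random filler vectors for terminal symbols, and orthonormal left/right role vectors; zero-padding across depths is an added simplification of my own, not the paper's construction.

    import numpy as np

    rng = np.random.default_rng(1)
    FILLERS = {s: rng.normal(size=4) for s in ["A", "B", "C"]}   # terminal symbols
    r0, r1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])          # left/right roles

    def encode(tree):
        # Encode a nested tuple such as ("A", ("B", "C")) by binding each
        # subtree to its positional role with a Kronecker (outer) product.
        if isinstance(tree, str):
            return FILLERS[tree]
        left = np.kron(encode(tree[0]), r0)
        right = np.kron(encode(tree[1]), r1)
        n = max(len(left), len(right))               # subtrees may differ in depth,
        left = np.pad(left, (0, n - len(left)))      # so embed both in one space
        right = np.pad(right, (0, n - len(right)))
        return left + right

    v = encode(("A", ("B", "C")))
    # Unbinding: project the role index out again to recover the left child.
    left_child = v.reshape(-1, 2) @ r0   # equals FILLERS["A"], zero-padded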


Time Trials on Second-Order and Variable-Learning-Rate Algorithms

Neural Information Processing Systems

The performance of seven minimization algorithms is compared on five neural network problems. These include a variable-step-size algorithm, conjugate gradient, and several methods with explicit analytic or numerical approximations to the Hessian.


Remarks on Interpolation and Recognition Using Neural Nets

Neural Information Processing Systems

We consider different types of single-hidden-layer feedforward nets: with or without direct input-to-output connections, and using either threshold or sigmoidal activation functions. The main results show that direct connections in threshold nets double the recognition power but not the interpolation power, while using sigmoids rather than thresholds allows (at least) doubling both. Various results are also given on VC dimension and other measures of recognition capabilities.


Discovering Discrete Distributed Representations with Iterative Competitive Learning

Neural Information Processing Systems

Competitive learning is an unsupervised algorithm that classifies input patterns into mutually exclusive clusters. In a neural net framework, each cluster is represented by a processing unit that competes with others in a winner-take-all pool for an input pattern. I present a simple extension to the algorithm that allows it to construct discrete, distributed representations. Discrete representations are useful because they are relatively easy to analyze and their information content can readily be measured. Distributed representations are useful because they explicitly encode similarity. The basic idea is to apply competitive learning iteratively to an input pattern, and after each stage to subtract from the input pattern the component that was captured in the representation at that stage. This component is simply the weight vector of the winning unit of the competitive pool. The subtraction procedure forces competitive pools at different stages to encode different aspects of the input. The algorithm is essentially the same as a traditional data compression technique known as multistep vector quantization, although the neural net perspective suggests potentially powerful extensions to that approach.
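Since the procedure is described step by step above, a minimal sketch follows, assuming Euclidean winner selection and the standard competitive learning update; the pool sizes, learning rate, and data are arbitrary illustrations.

    import numpy as np

    rng = np.random.default_rng(2)

    def train_stages(data, n_stages=3, n_units=4, lr=0.05, epochs=20):
        # Multistage competitive learning, i.e. multistep vector quantization.
        stages = [rng.normal(0.0, 0.1, (n_units, data.shape[1]))
                  for _ in range(n_stages)]
        for _ in range(epochs):
            for x in data:
                residual = x.copy()
                for W in stages:
                    win = int(np.argmin(np.sum((W - residual) ** 2, axis=1)))
                    W[win] += lr * (residual - W[win])   # move winner toward input
                    residual = residual - W[win]         # next pool sees remainder
        return stages

    def encode(x, stages):
        # The code for x is one winner index per stage: discrete and distributed.
        code, residual = [], x.copy()
        for W in stages:
            win = int(np.argmin(np.sum((W - residual) ** 2, axis=1)))
            code.append(win)
            residual = residual - W[win]
        return code

    data = rng.normal(size=(100, 8))
    stages = train_stages(data)
    print(encode(data[0], stages))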


Learning Trajectory and Force Control of an Artificial Muscle Arm by Parallel-hierarchical Neural Network Model

Neural Information Processing Systems

We propose a new parallel-hierarchical neural network model to enable motor learning for simultaneous control of both trajectory and force.