Reinforcement Learning Applied to Linear Quadratic Regulation
Recent research on reinforcement learning has focused on algorithms based on the principles of Dynamic Programming (DP). One of the most promising areas of application for these algorithms is the control of dynamical systems, and some impressive results have been achieved. However, there are significant gaps between practice and theory. In particular, there are no convergence proofs for problems with continuous state and action spaces, or for systems involving nonlinear function approximators (such as multilayer perceptrons). This paper presents research applying DP-based reinforcement learning theory to Linear Quadratic Regulation (LQR), an important class of control problems involving continuous state and action spaces and requiring a simple type of nonlinear function approximator. We describe an algorithm based on Q-learning that is proven to converge to the optimal controller for a large class of LQR problems. We also describe a slightly different algorithm that is only locally convergent to the optimal Q-function, demonstrating one of the possible pitfalls of using a nonlinear function approximator with DP-based learning.
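To make the setting concrete, the sketch below runs policy-iteration-style Q-learning on a toy LQR problem, exploiting the fact that the Q-function of a fixed linear policy is an exact quadratic form in the state-action vector. It is a minimal illustration, not the paper's algorithm: the plant matrices, cost weights, exploration noise and sample counts are invented for the example.

```python
# Hedged sketch: policy-iteration-style Q-learning on a toy LQR problem.
# The plant (A, B), cost weights, noise level and sample sizes are illustrative
# choices, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])              # stable toy plant: x' = A x + B u
B = np.array([[0.0],
              [0.1]])
Qc, Rc = np.eye(2), 0.1 * np.eye(1)     # quadratic stage cost x'Qx + u'Ru
n, m = 2, 1

def cost(x, u):
    return float(x @ Qc @ x + u @ Rc @ u)

def phi(x, u):
    z = np.concatenate([x, u])          # Q(x, u) is quadratic in [x; u],
    return np.outer(z, z).ravel()       # hence linear in these features

K = np.zeros((m, n))                    # initial (stabilizing) policy u = -K x
for _ in range(10):                     # policy iteration
    rows, targets = [], []
    for _ in range(300):                # evaluate the policy from random transitions
        x = rng.standard_normal(n)
        u = -K @ x + 0.1 * rng.standard_normal(m)   # exploration noise
        x_next = A @ x + B @ u
        u_next = -K @ x_next                        # on-policy successor action
        # Bellman identity for the fixed policy: Q(x,u) - Q(x',u') = cost(x,u)
        rows.append(phi(x, u) - phi(x_next, u_next))
        targets.append(cost(x, u))
    w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    H = 0.5 * (w.reshape(n + m, n + m) + w.reshape(n + m, n + m).T)
    Hux, Huu = H[n:, :n], H[n:, n:]
    K = np.linalg.solve(Huu, Hux)       # greedy improvement: argmin_u Q(x, u) = -K x

print("learned feedback gain K:\n", K)
```

Because the quadratic Q-function is linear in the outer-product features, each evaluation step reduces to ordinary least squares, and the greedy policy extracted from the learned matrix H is again linear in the state.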
Combining Neural and Symbolic Learning to Revise Probabilistic Rule Bases
Mahoney, J. Jeffrey, Mooney, Raymond J.
Recently, both connectionist and symbolic methods have been developed for biasing learning with prior knowledge [Fu, 1989; Towell et al., 1990; Ourston and Mooney, 1990]. Most of these methods revise an imperfect knowledge base (usually obtained from a domain expert) to fit a set of empirical data. Some of these methods have been successfully applied to real-world tasks, such as recognizing promoter sequences in DNA [Towell et al., 1990; Ourston and Mooney, 1990]. The results demonstrate that revising an expert-given knowledge base produces more accurate results than learning from training data alone. In this paper, we describe the RAPTURE system (Revising Approximate Probabilistic Theories Using Repositories of Examples), which combines connectionist and symbolic methods to revise both the parameters and structure of a certainty-factor rule base.
Interposing an ontogenetic model between Genetic Algorithms and Neural Networks
The relationship between learning, development and evolution in Nature is taken seriously, suggesting a model of the developmental process whereby the genotypes manipulated by the Genetic Algorithm (GA) might be expressed to form phenotypic neural networks (NNets) that then go on to learn. ONTOL is a grammar for generating polynomial NNets for time-series prediction. Genomes correspond to an ordered sequence of ONTOL productions and define a grammar that is expressed to generate a NNet. The NNet's weights are then modified by learning, and the individual's prediction error is used to determine GA fitness. In the preliminary but encouraging results presented, a new gene-doubling operator appears critical to the formation of new genetic alternatives.
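The abstract does not spell out the operator's mechanics, so the fragment below only illustrates the general idea of gene doubling on a genome encoded as an ordered list of production indices; the representation and the choice of duplication site are assumptions, not ONTOL's actual specification.

```python
# Hedged sketch of a gene-doubling operator on a genome represented as an
# ordered list of grammar-production indices; the encoding is assumed for
# illustration and may differ from ONTOL's.
import random

def gene_doubling(genome, rng=random):
    """Return a copy of the genome with one randomly chosen gene duplicated
    in place, by analogy with biological gene duplication."""
    i = rng.randrange(len(genome))
    return genome[:i + 1] + [genome[i]] + genome[i + 1:]

genome = [3, 0, 7, 2, 5]             # indices into a table of productions
print(gene_doubling(genome))         # e.g. [3, 0, 7, 7, 2, 5]
```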
Q-Learning with Hidden-Unit Restarting
Platt's resource-allocation network (RAN) (Platt, 1991a, 1991b) is modified for a reinforcement-learning paradigm and to "restart" existing hidden units rather than adding new units. After restarting, units continue to learn via back-propagation. The resulting restart algorithm is tested in a Q-learning network that learns to solve an inverted pendulum problem. Solutions are found faster on average with the restart algorithm than without it.
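As a rough sketch of the restart idea, with RAN-style novelty tests standing in for the paper's exact criteria and an invented rule for choosing which unit to recycle, one might re-initialize an existing radial-basis unit onto a novel, poorly predicted input instead of allocating a new one:

```python
# Hedged sketch of hidden-unit restarting in a RAN-style radial-basis network.
# The novelty thresholds and the rule for picking which unit to restart are
# illustrative assumptions, not the paper's exact criteria.
import numpy as np

class RestartingRBF:
    def __init__(self, n_in, n_units, width=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = rng.standard_normal((n_units, n_in))
        self.widths = np.full(n_units, width)
        self.w = np.zeros(n_units)                 # output weights (scalar output)

    def hidden(self, x):
        d2 = ((self.centers - x) ** 2).sum(axis=1)
        return np.exp(-d2 / (2.0 * self.widths ** 2))

    def predict(self, x):
        return float(self.hidden(x) @ self.w)

    def maybe_restart(self, x, error, err_thresh=0.5, dist_thresh=2.0):
        """If the error is large and x is far from every center, re-initialize
        ("restart") an existing unit instead of allocating a new one."""
        dists = np.linalg.norm(self.centers - x, axis=1)
        if abs(error) > err_thresh and dists.min() > dist_thresh:
            j = int(np.argmin(np.abs(self.w)))     # recycle the least-used unit
            self.centers[j] = x                    # move it onto the novel input
            self.w[j] = error                      # absorb the current error
        # after any restart, all units keep training by ordinary gradient descent
```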
Nets with Unreliable Hidden Nodes Learn Error-Correcting Codes
In a multi-layered neural network, any one of the hidden layers can be viewed as computing a distributed representation of the input. Several "encoder" experiments have shown that when the representation space is small it can be fully used. But computing with such a representation requires completely dependable nodes. In the case where the hidden nodes are noisy and unreliable, we find that error-correcting schemes emerge simply by using noisy units during training; random errors injected during backpropagation result in spreading the representations apart. Average and minimum distances increase with misfire probability, as predicted by coding-theoretic considerations. Furthermore, the effect of this noise is to protect the machine against permanent node failure, thereby potentially extending the useful lifetime of the machine.
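A minimal sketch of the kind of training procedure being described, assuming a tiny 8-6-8 "encoder" network and a simple flip model of unit misfires (the architecture, learning rate and misfire probability are illustrative choices, not the paper's experimental settings):

```python
# Hedged sketch: an 8-6-8 "encoder" network trained with random hidden-unit
# misfires (each sigmoidal hidden output is flipped to 1 - h with probability p).
# Architecture, learning rate and p are illustrative, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)
N, H, p, lr = 8, 6, 0.2, 1.0
X = np.eye(N)                               # one-hot inputs, reproduced at the output
W1 = rng.normal(0.0, 0.5, (N, H))
W2 = rng.normal(0.0, 0.5, (H, N))
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

for step in range(10000):
    h = sigmoid(X @ W1)
    flip = rng.random(h.shape) < p          # random misfires during training
    h_noisy = np.where(flip, 1.0 - h, h)
    y = sigmoid(h_noisy @ W2)
    dy = (y - X) * y * (1.0 - y)            # squared-error backprop at the output
    dW2 = h_noisy.T @ dy
    dh = dy @ W2.T
    sign = np.where(flip, -1.0, 1.0)        # gradient through the flipped units
    dW1 = X.T @ (dh * sign * h * (1.0 - h))
    W1 -= lr * dW1 / N
    W2 -= lr * dW2 / N

codes = (sigmoid(X @ W1) > 0.5).astype(int) # binarized hidden representations
d = [np.abs(codes[i] - codes[j]).sum() for i in range(N) for j in range(i)]
print("min / mean Hamming distance between codes:", min(d), sum(d) / len(d))
```

Measuring the pairwise Hamming distances of the binarized codes is one way to observe whether the noisy training spreads the representations apart.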
Efficient Pattern Recognition Using a New Transformation Distance
Simard, Patrice, LeCun, Yann, Denker, John S.
Memory-based classification algorithms such as radial basis functions or K-nearest neighbors typically rely on simple distances (Euclidean, dot product, ...), which are not particularly meaningful on pattern vectors. More complex, better-suited distance measures are often expensive and rather ad hoc (elastic matching, deformable templates). We propose a new distance measure which (a) can be made locally invariant to any set of transformations of the input and (b) can be computed efficiently. We tested the method on large handwritten character databases provided by the Post Office and the NIST. Using invariances with respect to translation, rotation, scaling, shearing and line thickness, the method consistently outperformed all other systems tested on the same databases.
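A hedged sketch of such a locally invariant distance: each pattern is allowed to slide along the tangent plane of its transformation manifold, and the distance is the closest approach of the two planes. Here only two translation tangent vectors, obtained by finite differences, are used; the paper's full set of transformations and its efficient implementation are not reproduced.

```python
# Hedged sketch of a locally transformation-invariant ("tangent") distance.
# Tangent vectors come from finite-difference image shifts, purely as an example;
# other transformations would contribute additional tangent vectors.
import numpy as np

def shift_tangents(img):
    """Tangent vectors for horizontal/vertical translation (finite differences)."""
    dx = np.roll(img, 1, axis=1) - img
    dy = np.roll(img, 1, axis=0) - img
    return np.stack([dx.ravel(), dy.ravel()], axis=1)   # shape: (pixels, 2)

def tangent_distance(p, e):
    """Two-sided tangent distance between images p and e."""
    Tp, Te = shift_tangents(p), shift_tangents(e)
    # minimize ||(p + Tp a) - (e + Te b)||^2 over the linear coefficients a, b
    A = np.hstack([Tp, -Te])
    coef, *_ = np.linalg.lstsq(A, (e - p).ravel(), rcond=None)
    residual = (p - e).ravel() + A @ coef
    return np.linalg.norm(residual)

a = np.zeros((16, 16)); a[4:12, 7] = 1.0      # a vertical bar
b = np.roll(a, 1, axis=1)                     # the same bar shifted by one pixel
print("Euclidean distance:", np.linalg.norm(a - b))
print("tangent distance  :", tangent_distance(a, b))   # close to zero
```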
Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors
LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak
We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The online version is computationally very efficient and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second-derivative matrix (Hessian), which does not even require computing the Hessian. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters.
1 INTRODUCTION
Choosing the appropriate learning rate, or step size, in a gradient descent procedure such as backpropagation is simultaneously one of the most crucial and expert-intensive parts of neural-network learning. We propose a method for computing the best step size that is well-principled, simple, computationally very cheap, and, most of all, applicable to online training with large networks and data sets.
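The core trick can be illustrated in a few lines: approximate Hessian-vector products by finite differences of the gradient and run a power iteration, so the principal eigenvalue is estimated without ever forming the Hessian. The quadratic test function, step constant and iteration count below are illustrative choices, and the sketch is batch rather than online.

```python
# Hedged sketch: estimate the Hessian's principal eigenvalue with a power
# iteration whose Hessian-vector products are finite differences of gradients.
# The test function and constants are illustrative, not the paper's settings.
import numpy as np

def principal_eigenvalue(grad, w, n_iter=100, alpha=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    g0 = grad(w)
    lam = 0.0
    for _ in range(n_iter):
        v_unit = v / np.linalg.norm(v)
        hv = (grad(w + alpha * v_unit) - g0) / alpha   # ~ H @ v_unit
        lam = float(v_unit @ hv)                       # Rayleigh quotient
        v = hv                                         # power-iteration step
    return lam, v / np.linalg.norm(v)

# check on a quadratic E(w) = 0.5 w' A w, whose Hessian is exactly A
A = np.diag([10.0, 3.0, 1.0])
grad = lambda w: A @ w
lam, v = principal_eigenvalue(grad, np.ones(3))
print("largest eigenvalue ~", lam, "-> step size of order", 1.0 / lam)
```

A step size on the order of the inverse of the largest eigenvalue keeps gradient descent stable along the steepest curvature direction, which is the quantity this estimate provides.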
Computing with Almost Optimal Size Neural Networks
Siu, Kai-Yeung, Roychowdhury, Vwani, Kailath, Thomas
Artificial neural networks are composed of an interconnected collection of certain nonlinear devices; examples of commonly used devices include linear threshold elements, sigmoidal elements and radial-basis elements. We employ results from harmonic analysis and the theory of rational approximation to obtain almost tight lower bounds on the size (i.e., the number of elements) of such networks.
Using Prior Knowledge in a NNPDA to Learn Context-Free Languages
Das, Sreerupa, Giles, C. Lee, Sun, Guo-Zheng
Language inference and automata induction using recurrent neural networks have gained considerable interest in recent years. Nevertheless, the success of these models has been mostly limited to regular languages. Additional information in the form of a priori knowledge has proved important and at times necessary for learning complex languages (Abu-Mostafa, 1990; Al-Mashouq and Reed, 1991; Omlin and Giles, 1992; Towell, 1990). These studies have demonstrated that partial information incorporated in a connectionist model guides the learning process through constraints for efficient learning and better generalization. We have previously shown that the NNPDA model can learn Deterministic Context-Free Languages.