AITopics

The subject of this paper is the integration of multi-layered Artificial Neural Networks (ANN) with probability density functions such as Gaussian mixtures found in continuous density Hidden Markov Models (HMM). In the first part of this paper we present an ANN/HMM hybrid in which all the parameters of the system are simultaneously optimized with respect to a single criterion. In the second part of this paper, we study the relationship between the density of the inputs of the network and the density of the outputs of the network. A few experiments are presented to explore how to perform density estimation with ANNs. 1 INTRODUCTION This paper studies the integration of Artificial Neural Networks (ANN) with probability density functions (pdf) such as the Gaussian mixtures often used in continuous density Hidden Markov Models. The ANNs considered here are multi-layered or recurrent networks with hyperbolic tangent hidden units.

artificial intelligence, experiment, machine learning, (12 more...)

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Bayesian Model Comparison and Backprop Nets

MacKay, David J. C.

The Bayesian model comparison framework is reviewed, and the Bayesian Occam's razor is explained. This framework can be applied to feedforward networks, making possible (1) objective comparisons between solutions using alternative network architectures; (2) objective choice of magnitude and type of weight decay terms; (3) quantified estimates of the error bars on network parameters and on network output. The framework also generates a measure of the effective number of parameters determined by the data. The relationship of Bayesian model comparison to recent work on prediction of generalisation ability (Guyon et al., 1992; Moody, 1992) is discussed.

artificial intelligence, machine learning, occam factor, (14 more...)

Country: North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Best-First Model Merging for Dynamic Learning and Recognition

Omohundro, Stephen M.

Stephen M. Omohundro, International Computer Science Institute, 1947 Center Street, Suite 600, Berkeley, California 94704. Abstract: "Best-first model merging" is a general technique for dynamically choosing the structure of a neural or related architecture while avoiding overfitting. It is applicable to both learning and recognition tasks and often generalizes significantly better than fixed structures. We demonstrate the approach applied to the tasks of choosing radial basis functions for function learning, choosing local affine models for curve and constraint surface modelling, and choosing the structure of a balltree or bumptree to maximize efficiency of access. 1 TOWARD MORE COGNITIVE LEARNING Standard backpropagation neural networks learn in a way which appears to be quite different from human learning. Viewed as a cognitive system, a standard network always maintains a complete model of its domain. This model is mostly wrong initially, but gets gradually better and better as data appears. The net deals with all data in much the same way and has no representation for the strength of evidence behind a certain conclusion. The network architecture is usually chosen before any data is seen and the processing is much the same in the early phases of learning as in the late phases.

artificial intelligence, best-first model merging, machine learning, (16 more...)

Country: North America > United States > California > Alameda County > Berkeley (0.24)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Freund, Yoav, Haussler, David

Unsupervised learning of distributions on binary vectors using two layer networks

We study a particular type of Boltzmann machine with a bipartite graph structure called a harmonium. Our interest is in using such a machine to model a probability distribution on binary input vectors. We analyze the class of probability distributions that can be modeled by such machines.

artificial intelligence, harmonium model, machine learning, (17 more...)

Country: North America > United States > California > Santa Cruz County > Santa Cruz (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.37)

Smyth, Padhraic, Mellstrom, Jeff

Fault Diagnosis of Antenna Pointing Systems using Hybrid Neural Network and Signal Processing Models

Padhraic Smyth, Jeff Mellstrom, Jet Propulsion Laboratory 238-420, California Institute of Technology, Pasadena, CA 91109. Abstract: We describe in this paper a novel application of neural networks to system health monitoring of a large antenna for deep space communications. The paper outlines our approach to building a monitoring system using hybrid signal processing and neural network techniques, including autoregressive modelling, pattern recognition, and Hidden Markov models. We discuss several problems which are somewhat generic in applications of this kind; in particular, we address the problem of detecting classes which were not present in the training data. Experimental results indicate that the proposed system is sufficiently reliable for practical implementation. 1 Background: The Deep Space Network The Deep Space Network (DSN) (designed and operated by the Jet Propulsion Laboratory (JPL) for the National Aeronautics and Space Administration (NASA)) is unique in terms of ...

artificial intelligence, machine learning, training data, (14 more...)

Country: North America > United States > California > Los Angeles County > Pasadena (0.24)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Government > Space Agency (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)

Röscheisen, Martin, Hofmann, Reimar, Tresp, Volker

Neural Control for Rolling Mills: Incorporating Domain Theories to Overcome Data Deficiency

In a Bayesian framework, we give a principled account of how domain-specific prior knowledge such as imperfect analytic domain theories can be optimally incorporated into networks of locally-tuned units: by choosing a specific architecture and by applying a specific training regimen. Our method proved successful in overcoming the data deficiency problem in a large-scale application to devise a neural control for a hot line rolling mill. It achieves in this application significantly higher accuracy than optimally-tuned standard algorithms such as sigmoidal backpropagation, and outperforms the state-of-the-art solution.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Genre: Research Report > Promising Solution (0.35)

Industry: Materials > Metals & Mining (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Cooper, Paul R., Prokopowicz, Peter N.

Markov Random Fields Can Bridge Levels of Abstraction

Network vision systems must make inferences from evidential information across levels of representational abstraction, from low level invariants, through intermediate scene segments, to high level behaviorally relevant object descriptions. This paper shows that such networks can be realized as Markov Random Fields (MRFs). We show first how to construct an MRF functionally equivalent to a Hough transform parameter network, thus establishing a principled probabilistic basis for visual networks. Second, we show that these MRF parameter networks are more capable and flexible than traditional methods. In particular, they have a well-defined probabilistic interpretation, intrinsically incorporate feedback, and offer richer representations and decision capabilities.

artificial intelligence, machine learning, markov random field, (15 more...)

Country: Asia > Japan (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.75)

Haffner, Patrick, Waibel, Alex

Multi-State Time Delay Networks for Continuous Speech Recognition

We present the "Multi-State Time Delay Neural Network" (MS-TDNN) as an extension of the TDNN to robust word recognition.

artificial intelligence, machine learning, procedure, (15 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > Canada > Quebec > Montreal (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.95)

A practical Bayesian framework for back-propagation networks

MacKay, D. J. C.

Classics, Feb-1-1992

A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained.

bayesian inference, machine learning, practical bayesian framework, (3 more...)

Classics

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Bayesian model of plan recognition

Charniak, E. | Goldman, R.

Classics, Feb-1-1992

We argue that the problem of plan recognition, inferring an agent's plan from observations, is largely a problem of inference under conditions of uncertainty. We present an approach to the plan recognition problem that is based on Bayesian probability theory. In attempting to solve a plan recognition problem we first retrieve candidate explanations. These explanations (sometimes only the most promising ones) are assembled into a plan recognition Bayesian network, which is a representation of a probability distribution over the set of possible explanations. We perform Bayesian updating to choose the most likely interpretation for the set of observed actions.

artificial intelligence, machine learning, planning & scheduling, (5 more...)

Classics

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)