AITopics

Department of Mathematics, University of California, San Diego, La Jolla, CA 92093-0112

Abstract

We propose diffusion networks, a type of recurrent neural network with probabilistic dynamics, as models for learning natural signals that are continuous in time and space. We give a formula for the gradient of the log-likelihood of a path with respect to the drift parameters for a diffusion network. This gradient can be used to optimize diffusion networks in the nonequilibrium regime for a wide variety of problems, paralleling techniques which have succeeded in engineering fields such as system identification, state estimation and signal filtering. An aspect of this work which is of particular interest to computational neuroscience and hardware design is that with a suitable choice of activation function, e.g., quasi-linear sigmoidal, the gradient formula is local in space and time.

1 Introduction

Many natural signals, like pixel gray-levels, line orientations, object position, velocity and shape parameters, are well described as continuous-time continuous-valued stochastic processes; however, the neural network literature has seldom explored the continuous stochastic case. Since the solutions to many decision theoretic problems of interest are naturally formulated using probability distributions, it is desirable to have a flexible framework for approximating probability distributions on continuous path spaces.

diffusion network, gradient, learning path distribution, (14 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.25)
North America > United States > California > San Diego County > La Jolla (0.25)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Meila, Marina, Jordan, Michael I.

Estimating Dependency Structure as a Hidden Variable

This paper introduces a probability model, the mixture of trees, that can account for sparse, dynamically changing dependence relationships. We present a family of efficient algorithms that use EM and the Minimum Spanning Tree algorithm to find the ML and MAP mixture of trees for a variety of priors, including the Dirichlet and the MDL priors.

algorithm, basic algorithm, dependency structure, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.06)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)

Marrs, Alan D.

An Application of Reversible-Jump MCMC to Multivariate Spherical Gaussian Mixtures

Applications of Gaussian mixture models occur frequently in the fields of statistics and artificial neural networks.

mixture component, mixture model, model order, (14 more...)

Country: Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Hofmann, Reimar, Tresp, Volker

Nonlinear Markov Networks for Continuous Variables

We address the problem of learning structure in nonlinear Markov networks with continuous variables. This can be viewed as non-Gaussian multidimensional density estimation exploiting certain conditional independencies in the variables. Markov networks are a graphical way of describing conditional independencies, well suited to modeling relationships which do not exhibit a natural causal ordering. We use neural network structures to model the quantitative relationships between variables. The main focus in this paper will be on learning the structure for the purpose of gaining insight into the underlying process. Using two data sets we show that interesting structures can be found using our approach. Inference will be briefly addressed.

boston housing data, markov boundary, markov network, (12 more...)

Country:

Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Asia > Japan (0.04)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Frey, Brendan J., MacKay, David J. C.

A Revolution: Belief Propagation in Graphs with Cycles

Until recently, artificial intelligence researchers have frowned upon the application of probability propagation in Bayesian belief networks that have cycles. The probability propagation algorithm is only exact in networks that are cycle-free. However, it has recently been discovered that the two best error-correcting decoding algorithms are actually performing probability propagation in belief networks with cycles.

1 Communicating over a noisy channel

Our increasingly wired world demands efficient methods for communicating bits of information over physical channels that introduce errors. Examples of real-world channels include twisted-pair telephone wires, shielded cable-TV wire, fiberoptic cable, deep-space radio, terrestrial radio, and indoor radio. Engineers attempt to correct the errors introduced by the noise in these channels through the use of channel coding which adds protection to the information source, so that some channel errors can be corrected.

bayesian network, information bit, probability propagation, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Illinois (0.04)
(4 more...)

Industry:

Media > Television (0.54)
Leisure & Entertainment (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Freitas, João F. G. de, Niranjan, Mahesan, Gee, Andrew H.

Regularisation in Sequential Learning Algorithms

In this paper, we discuss regularisation in online/sequential learning algorithms. In environments where data arrives sequentially, techniques such as cross-validation to achieve regularisation or model selection are not possible. Further, bootstrapping to determine a confidence level is not practical. To surmount these problems, a minimum variance estimation approach that makes use of the extended Kalman algorithm for training multi-layer perceptrons is employed. The novel contribution of this paper is to show the theoretical links between extended Kalman filtering, Sutton's variable learning rate algorithms and Mackay's Bayesian estimation framework. In doing so, we propose algorithms to overcome the need for heuristic choices of the initial conditions and noise covariance matrices in the Kalman approach.

algorithm, ekf algorithm, posterior density function, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.06)
North America > United States > California > San Mateo County > San Mateo (0.04)
Africa > South Africa (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)

Bishop, Christopher M., Lawrence, Neil D., Jaakkola, Tommi, Jordan, Michael I.

Approximating Posterior Distributions in Belief Networks Using Mixtures

Exact inference in densely connected Bayesian networks is computationally intractable, and so there is considerable interest in developing effective approximation schemes. One approach which has been adopted is to bound the log likelihood using a mean-field approximating distribution. While this leads to a tractable algorithm, the mean field distribution is assumed to be factorial and hence unimodal. In this paper we demonstrate the feasibility of using a richer class of approximating distributions based on mixtures of mean field distributions. We derive an efficient algorithm for updating the mixture parameters and apply it to the problem of learning in sigmoid belief networks. Our results demonstrate a systematic improvement over simple mean field theory as the number of mixture components is increased.

approximating posterior distribution, hlm, log likelihood, (11 more...)

Country:

Asia > Middle East > Jordan (0.07)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Bengio, Yoshua, Bengio, Samy, Isabelle, Jean-François, Singer, Yoram

Shared Context Probabilistic Transducers

Recently, a model for supervised learning of probabilistic transducers represented by suffix trees was introduced. However, this algorithm tends to build very large trees, requiring very large amounts of computer memory. In this paper, we propose a new, more compact, transducer model in which one shares the parameters of distributions associated to contexts yielding similar conditional output distributions. We illustrate the advantages of the proposed algorithm with comparative experiments on inducing a noun phrase recognizer.

algorithm, node, transducer, (13 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Barber, David, Schottky, Bernhard

Radial Basis Functions: A Bayesian Treatment

Bayesian methods have been successfully applied to regression and classification problems in multi-layer perceptrons. We present a novel application of Bayesian techniques to Radial Basis Function networks by developing a Gaussian approximation to the posterior distribution which, for fixed basis function widths, is analytic in the parameters. The setting of regularization constants by crossvalidation is wasteful as only a single optimal parameter estimate is retained. We treat this issue by assigning prior distributions to these constants, which are then adapted in light of the data under a simple re-estimation formula.

1 Introduction

Radial Basis Function networks are popular regression and classification tools [10]. For fixed basis function centers, RBFs are linear in their parameters and can therefore be trained with simple one shot linear algebra techniques [10]. The use of unsupervised techniques to fix the basis function centers is, however, not generally optimal since setting the basis function centers using density estimation on the input data alone takes no account of the target values associated with that data. Ideally, therefore, we should include the target values in the training procedure [7, 3, 9]. Unfortunately, allowing centers to adapt to the training targets leads to the RBF being a nonlinear function of its parameters, and training becomes more problematic. Most methods that perform supervised training of RBF parameters minimize the

* Present address: SNN, University of Nijmegen, Geert Grooteplein 21, Nijmegen, The Netherlands.

procedure, radial basis function, regularization constant, (13 more...)

Country:

Europe > Netherlands > Gelderland > Nijmegen (0.45)
North America > United States > New York (0.04)
Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sahani, Maneesh, Pezaris, John S., Andersen, Richard A.

On the Separation of Signals from Neighboring Cells in Tetrode Recordings

We discuss a solution to the problem of separating waveforms produced by multiple cells in an extracellular neural recording. We take an explicitly probabilistic approach, using latent-variable models of varying sophistication to describe the distribution of waveforms produced by a single cell. The models range from a single Gaussian distribution of waveforms for each cell to a mixture of hidden Markov models. We stress the overall statistical structure of the approach, allowing the details of the generative model chosen to depend on the specific neural preparation.

algorithm, separation, waveform, (17 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)