Undirected Networks
Nonlinear Markov Networks for Continuous Variables
Hofmann, Reimar, Tresp, Volker
We address the problem of learning structure in nonlinear Markov networks with continuous variables. This can be viewed as non-Gaussian multidimensional density estimation exploiting certain conditional independencies in the variables. Markov networks are a graphical way of describing conditional independencies and are well suited to modeling relationships which do not exhibit a natural causal ordering. We use neural network structures to model the quantitative relationships between variables. The main focus in this paper will be on learning the structure for the purpose of gaining insight into the underlying process. Using two data sets we show that interesting structures can be found using our approach. Inference will be briefly addressed.
Shared Context Probabilistic Transducers
Bengio, Yoshua, Bengio, Samy, Isabelle, Jean-Franc, Singer, Yoram
Recently, a model for supervised learning of probabilistic transducers represented by suffix trees was introduced. However, this algorithm tends to build very large trees, requiring very large amounts of computer memory. In this paper, we propose a new, more compact, transducer model in which one shares the parameters of distributions associated to contexts yielding similar conditional output distributions. We illustrate the advantages of the proposed algorithm with comparative experiments on inducing a noun phrase recognizer.
On the Separation of Signals from Neighboring Cells in Tetrode Recordings
Sahani, Maneesh, Pezaris, John S., Andersen, Richard A.
We discuss a solution to the problem of separating waveforms produced by multiple cells in an extracellular neural recording. We take an explicitly probabilistic approach, using latent-variable models of varying sophistication to describe the distribution of waveforms produced by a single cell. The models range from a single Gaussian distribution of waveforms for each cell to a mixture of hidden Markov models. We stress the overall statistical structure of the approach, allowing the details of the generative model chosen to depend on the specific neural preparation.
How to Dynamically Merge Markov Decision Processes
Singh, Satinder P., Cohn, David
We are frequently called upon to perform multiple tasks that compete for our attention and resource. Often we know the optimal solution to each task in isolation; in this paper, we describe how this knowledge can be exploited to efficiently find good solutions for doing the tasks in parallel. We formulate this problem as that of dynamically merging multiple Markov decision processes (MDPs) into a composite MDP, and present a new theoretically-sound dynamic programming algorithm for finding an optimal policy for the composite MDP. We analyze various aspects of our algorithm and illustrate its use on a simple merging problem. Every day, we are faced with the problem of doing multiple tasks in parallel, each of which competes for our attention and resource. If we are running a job shop, we must decide which machines to allocate to which jobs, and in what order, so that no jobs miss their deadlines. If we are a mail delivery robot, we must find the intended recipients of the mail while simultaneously avoiding fixed obstacles (such as walls) and mobile obstacles (such as people), and still manage to keep ourselves sufficiently charged up. Frequently we know how to perform each task in isolation; this paper considers how we can take the information we have about the individual tasks and combine it to efficiently find an optimal solution for doing the entire set of tasks in parallel. More importantly, we describe a theoretically-sound algorithm for doing this merging dynamically; new tasks (such as a new job arrival at a job shop) can be assimilated online into the solution being found for the ongoing set of simultaneous tasks.
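To make the idea of a composite MDP concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it uses plain value iteration on the full cross-product state space, and it assumes (for illustration only) that composite states and actions are pairs of component states and actions, that component transitions are independent, that rewards add, and that a hypothetical `allowed` predicate encodes the resource constraint. The toy tasks, names, and numbers are invented.

```python
from itertools import product

def value_iteration(states, actions, trans, reward, gamma, tol=1e-10):
    """Generic value iteration; trans(s, a) returns {next_state: prob}."""
    V = {s: 0.0 for s in states}
    while True:
        new_V = {
            s: max(
                reward(s, a) + gamma * sum(p * V[s2] for s2, p in trans(s, a).items())
                for a in actions
            )
            for s in states
        }
        if max(abs(new_V[s] - V[s]) for s in states) < tol:
            return new_V
        V = new_V

def make_task():
    """Hypothetical component MDP: one unit of work earns reward 1, then nothing."""
    return {
        "states": ["todo", "done"],
        "actions": ["work", "idle"],
        "P": {("todo", "work"): {"done": 1.0}, ("todo", "idle"): {"todo": 1.0},
              ("done", "work"): {"done": 1.0}, ("done", "idle"): {"done": 1.0}},
        "R": {("todo", "work"): 1.0, ("todo", "idle"): 0.0,
              ("done", "work"): 0.0, ("done", "idle"): 0.0},
    }

def merge(m1, m2, allowed):
    """Assumed composite: product states, constrained product actions, summed rewards."""
    states = list(product(m1["states"], m2["states"]))
    actions = [a for a in product(m1["actions"], m2["actions"]) if allowed(a)]

    def trans(s, a):
        return {(t1, t2): p1 * p2
                for t1, p1 in m1["P"][(s[0], a[0])].items()
                for t2, p2 in m2["P"][(s[1], a[1])].items()}

    def reward(s, a):
        return m1["R"][(s[0], a[0])] + m2["R"][(s[1], a[1])]

    return states, actions, trans, reward

gamma = 0.9
task = make_task()

# Optimal value of one task solved in isolation.
V_single = value_iteration(task["states"], task["actions"],
                           lambda s, a: task["P"][(s, a)],
                           lambda s, a: task["R"][(s, a)], gamma)

# Composite of two tasks where only one can be worked on at a time.
states, actions, trans, reward = merge(task, make_task(),
                                       allowed=lambda a: a != ("work", "work"))
V_composite = value_iteration(states, actions, trans, reward, gamma)
```

In this toy setting the composite value of `("todo", "todo")` is lower than the sum of the two isolated values, because the constraint forces one task to wait a step. Under the stated assumptions the sum of isolated optimal values is an upper bound on the composite value, which hints at how knowledge of the component solutions could help when solving the merged problem.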
An Improved Policy Iteration Algorithm for Partially Observable MDPs
A new policy iteration algorithm for partially observable Markov decision processes is presented that is simpler and more efficient than an earlier policy iteration algorithm of Sondik (1971, 1978). The key simplification is representation of a policy as a finite-state controller. This representation makes policy evaluation straightforward. The paper's contribution is to show that the dynamic-programming update used in the policy improvement step can be interpreted as the transformation of a finite-state controller into an improved finite-state controller. The new algorithm consistently outperforms value iteration as an approach to solving infinite-horizon problems.
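The claim that a finite-state controller makes policy evaluation straightforward can be illustrated with a short Python sketch. Each (controller node, hidden state) pair has a value that satisfies a linear system; the sketch below approximates its solution by fixed-point iteration rather than solving the system directly. The data layout (`action`, `successor`, `T`, `O`, `R`) and the two-state toy problem are assumptions made for this example, not taken from the paper.

```python
def evaluate_controller(action, successor, T, O, R, gamma, tol=1e-10):
    """Evaluate a finite-state controller on a POMDP by fixed-point iteration.

    action[n]        action taken in controller node n
    successor[n][o]  next controller node after observation o
    T[a][s][s2]      P(s2 | s, a)
    O[a][s2][o]      P(o | s2, a)
    R[a][s]          immediate reward for action a in state s
    Returns V[n][s], the value of starting in node n with hidden state s.
    """
    n_nodes, n_states = len(action), len(R[0])
    V = [[0.0] * n_states for _ in range(n_nodes)]
    while True:
        new_V = []
        for n in range(n_nodes):
            a = action[n]
            row = []
            for s in range(n_states):
                # Expected value of the next (node, state) pair.
                future = sum(
                    T[a][s][s2] * O[a][s2][o] * V[successor[n][o]][s2]
                    for s2 in range(n_states)
                    for o in range(len(O[a][s2]))
                )
                row.append(R[a][s] + gamma * future)
            new_V.append(row)
        delta = max(abs(new_V[n][s] - V[n][s])
                    for n in range(n_nodes) for s in range(n_states))
        V = new_V
        if delta < tol:
            return V

# Toy problem (hypothetical): action 0 keeps the state, action 1 swaps it.
T = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
# Observations reveal the new state exactly, for both actions.
O = [[[1.0, 0.0], [0.0, 1.0]],
     [[1.0, 0.0], [0.0, 1.0]]]
# Being in state 0 pays 1 regardless of the action.
R = [[1.0, 0.0],
     [1.0, 0.0]]

# Two-node controller: node 0 stays, node 1 swaps; the observation picks the next node.
action = [0, 1]
successor = [[0, 1], [0, 1]]

V = evaluate_controller(action, successor, T, O, R, gamma=0.5)
```

Because the controller has finitely many nodes, the value function is just a table with one row per node, which is what makes the evaluation step a routine computation.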
Modeling Acoustic Correlations by Factor Analysis
Saul, Lawrence K., Rahim, Mazin G.
Hidden Markov models (HMMs) for automatic speech recognition rely on high dimensional feature vectors to summarize the short-time properties of speech. Correlations between features can arise when the speech signal is non-stationary or corrupted by noise. We investigate how to model these correlations using factor analysis, a statistical method for dimensionality reduction. Factor analysis uses a small number of parameters to model the covariance structure of high dimensional data. These parameters are estimated by an Expectation-Maximization (EM) algorithm that can be embedded in the training procedures for HMMs.
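The parameter saving mentioned above can be seen in a small Python sketch. In the standard factor analysis model, the covariance of d-dimensional data is written as a loading matrix times its transpose plus a diagonal noise term. The helper names, the example loadings, and the choice of d = 39 and f = 4 are illustrative assumptions, and the factor analysis count is the raw number of entries (it ignores the rotational redundancy of the loadings).

```python
def fa_covariance(loadings, noise_vars):
    """Return Lambda * Lambda^T + diag(Psi) as a list of lists."""
    d = len(loadings)
    cov = [[sum(loadings[i][k] * loadings[j][k] for k in range(len(loadings[i])))
            for j in range(d)]
           for i in range(d)]
    for i in range(d):
        cov[i][i] += noise_vars[i]
    return cov

def parameter_counts(d, f):
    """Raw covariance parameter counts for d-dimensional data with f factors."""
    return {
        "diagonal": d,                  # one variance per feature
        "factor_analysis": d * f + d,   # loadings plus per-feature noise
        "full": d * (d + 1) // 2,       # symmetric d x d matrix
    }

# Three features explained by two factors (made-up values).
loadings = [[1.0, 0.0],
            [0.5, 0.5],
            [0.0, 2.0]]
noise_vars = [0.1, 0.2, 0.3]
cov = fa_covariance(loadings, noise_vars)

# A feature dimension of 39 and 4 factors, chosen only for illustration.
counts = parameter_counts(d=39, f=4)
```

The off-diagonal entries of `cov` come entirely from the shared factors, which is how a handful of factors can capture correlations that a diagonal covariance would miss while using far fewer parameters than a full matrix.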
Analysis of Drifting Dynamics with Neural Network Hidden Markov Models
Kohlmorgen, Jens, Müller, Klaus-Robert, Pawelzik, Klaus
We present a method for the analysis of nonstationary time series with multiple operating modes. In particular, it is possible to detect and to model both a switching of the dynamics and a less abrupt, time-consuming drift from one mode to another. This is achieved in two steps. First, an unsupervised training method provides prediction experts for the inherent dynamical modes. Then, the trained experts are used in a hidden Markov model that allows drifts to be modeled. An application to physiological wake/sleep data demonstrates that analysis and modeling of real-world time series can be improved when the drift paradigm is taken into account.
Learning Path Distributions Using Nonequilibrium Diffusion Networks
Mineiro, Paul, Movellan, Javier R., Williams, Ruth J.
Department of Mathematics, University of California, San Diego, La Jolla, CA 92093-0112. We propose diffusion networks, a type of recurrent neural network with probabilistic dynamics, as models for learning natural signals that are continuous in time and space. We give a formula for the gradient of the log-likelihood of a path with respect to the drift parameters for a diffusion network. This gradient can be used to optimize diffusion networks in the nonequilibrium regime for a wide variety of problems, paralleling techniques which have succeeded in engineering fields such as system identification, state estimation and signal filtering. An aspect of this work which is of particular interest to computational neuroscience and hardware design is that with a suitable choice of activation function, e.g., quasi-linear sigmoidal, the gradient formula is local in space and time. Many natural signals, like pixel gray-levels, line orientations, object position, velocity and shape parameters, are well described as continuous-time continuous-valued stochastic processes; however, the neural network literature has seldom explored the continuous stochastic case. Since the solutions to many decision theoretic problems of interest are naturally formulated using probability distributions, it is desirable to have a flexible framework for approximating probability distributions on continuous path spaces.
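As a rough illustration of a recurrent network with probabilistic dynamics, the Python sketch below simulates one sample path of a stochastic differential equation with the Euler-Maruyama scheme. The drift used here (a leaky unit driven by sigmoidal activations through a weight matrix) is an assumption chosen for the example and is not necessarily the form used in the paper; the function name, weights, and step size are likewise hypothetical.

```python
import math
import random

def simulate_path(x0, weights, sigma, dt, n_steps, rng):
    """Euler-Maruyama simulation of dX = drift(X) dt + sigma dB.

    Assumed drift: -x_i + sum_j weights[i][j] * sigmoid(x_j).
    Returns the list of visited states, including the initial one.
    """
    def drift(x):
        act = [1.0 / (1.0 + math.exp(-xi)) for xi in x]
        return [-x[i] + sum(weights[i][j] * act[j] for j in range(len(x)))
                for i in range(len(x))]

    x = list(x0)
    path = [list(x)]
    for _ in range(n_steps):
        mu = drift(x)
        # Deterministic drift step plus Gaussian noise scaled by sqrt(dt).
        x = [x[i] + mu[i] * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
             for i in range(len(x))]
        path.append(x)
    return path

# Two units with made-up coupling weights and a fixed seed for repeatability.
weights = [[0.0, 1.5],
           [-1.5, 0.0]]
path = simulate_path([0.5, -0.5], weights, sigma=0.2, dt=0.01,
                     n_steps=200, rng=random.Random(0))

# With no noise and no coupling, the same code reduces to a decaying Euler step.
decay = simulate_path([1.0], [[0.0]], sigma=0.0, dt=0.1,
                      n_steps=3, rng=random.Random(0))
```

Each call produces one path; learning in the sense described above would adjust the drift parameters (here `weights`) so that observed paths become more likely, which this sketch does not attempt.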