Undirected Networks
An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition
This paper presents a novel Hidden Markov Model architecture to model the joint probability of pairs of asynchronous sequences de(cid:173) scribing the same event. It is based on two other Markovian models, namely Asynchronous Input/ Output Hidden Markov Models and Pair Hidden Markov Models. An EM algorithm to train the model is presented, as well as a Viterbi decoder that can be used to ob(cid:173) tain the optimal state sequence as well as the alignment between the two sequences. The model has been tested on an audio-visual speech recognition task using the M2VTS database and yielded robust performances under various noise conditions.
Forward-Decoding Kernel-Based Phone Recognition
Forward decoding kernel machines (FDKM) combine large-margin clas(cid:173) sifiers with hidden Markov models (HMM) for maximum a posteriori (MAP) adaptive sequence estimation. State transitions in the sequence are conditioned on observed data using a kernel-based probability model trained with a recursive scheme that deals effectively with noisy and par(cid:173) tially labeled data. Training over very large data sets is accomplished us(cid:173) ing a sparse probabilistic support vector machine (SVM) model based on quadratic entropy, and an on-line stochastic steepest descent algorithm. For speaker-independent continuous phone recognition, FDKM trained over 177,080 samples of the TlMIT database achieves 80.6% recognition accuracy over the full test set, without use of a prior phonetic language model.
Value-Directed Compression of POMDPs
We examine the problem of generating state-space compressions of POMDPs in a way that minimally impacts decision quality. We analyze the impact of compres- sions on decision quality, observing that compressions that allow accurate policy evaluation (prediction of expected future reward) will not affect decision qual- ity. We derive a set of sufficient conditions that ensure accurate prediction in this respect, illustrate interesting mathematical properties these confer on lossless lin- ear compressions, and use these to derive an iterative procedure for finding good linear lossy compressions. We also elaborate on how structured representations of a POMDP can be used to find such compressions.
Markov Models for Automated ECG Interval Analysis
We examine the use of hidden Markov and hidden semi-Markov mod- els for automatically segmenting an electrocardiogram waveform into its constituent waveform features. An undecimated wavelet transform is used to generate an overcomplete representation of the signal that is more appropriate for subsequent modelling. We show that the state dura- tions implicit in a standard hidden Markov model are ill-suited to those of real ECG features, and we investigate the use of hidden semi-Markov models for improved state duration modelling.
Fast Algorithms for Large-State-Space HMMs with Applications to Web Usage Analysis
In applying Hidden Markov Models to the analysis of massive data streams, it is often necessary to use an arti(cid:12)cially reduced set of states; this is due in large part to the fact that the basic HMM estimation algorithms have a quadratic dependence on the size of the state set. We present algorithms that reduce this computational bottleneck to linear or near-linear time, when the states can be embedded in an underlying grid of parameters. This type of state representation arises in many domains; in particular, we show an application to tra(cid:14)c analysis at a high-volume Web site.
Max-Margin Markov Networks
In typical classification tasks, we seek a function which assigns a label to a sin- gle object. Kernel-based approaches, such as support vector machines (SVMs), which maximize the margin of confidence of the classifier, are the method of choice for many such tasks. Their popularity stems both from the ability to use high-dimensional feature spaces, and from their strong theoretical guaran- tees. However, many real-world tasks involve sequential, spatial, or structured data, where multiple labels must be assigned. Existing kernel-based methods ig- nore structure in the problem, assigning labels independently to each object, los- ing much useful information. Conversely, probabilistic graphical models, such as Markov networks, can represent correlations between labels, by exploiting problem structure, but cannot handle high-dimensional feature spaces, and lack strong theoretical generalization guarantees.
A Nonlinear Predictive State Representation
Predictive state representations (PSRs) use predictions of a set of tests to represent the state of controlled dynamical systems. One reason why this representation is exciting as an alternative to partially observable Markov decision processes (POMDPs) is that PSR models of dynamical systems may be much more compact than POMDP models. Empirical work on PSRs to date has focused on linear PSRs, which have not allowed for compression relative to POMDPs. We introduce a new notion of tests which allows us to define a new type of PSR that is nonlinear in general and allows for exponential compression in some deterministic dynami- cal systems. These new tests, called e-tests, are related to the tests used by Rivest and Schapire [1] in their work with the diversity representation, but our PSR avoids some of the pitfalls of their representation--in partic- ular, its potential to be exponentially larger than the equivalent POMDP.
Applying Metric-Trees to Belief-Point POMDPs
Recent developments in grid-based and point-based approximation algo- rithms for POMDPs have greatly improved the tractability of POMDP planning. These approaches operate on sets of belief points by individ- ually learning a value function for each point. In reality, belief points exist in a highly-structured metric simplex, but current POMDP algo- rithms do not exploit this property. This paper presents a new metric-tree algorithm which can be used in the context of POMDP planning to sort belief points spatially, and then perform fast value function updates over groups of points.
Simplicial Mixtures of Markov Chains: Distributed Modelling of Dynamic User Profiles
To provide a compact generative representation of the sequential activ- ity of a number of individuals within a group there is a tradeoff between the definition of individual specific and global models. This paper pro- poses a linear-time distributed model for finite state symbolic sequences representing traces of individual user activity by making the assump- tion that heterogeneous user behavior may be'explained' by a relatively small number of common structurally simple behavioral patterns which may interleave randomly in a user-specific proportion. The results of an empirical study on three different sources of user traces indicates that this modelling approach provides an efficient representation scheme, re- flected by improved prediction performance as well as providing low- complexity and intuitively interpretable representations.
Linear Program Approximations for Factored Continuous-State Markov Decision Processes
Approximate linear programming (ALP) has emerged recently as one of the most promising methods for solving complex factored MDPs with (cid:2)nite state spaces. In this work we show that ALP solutions are not limited only to MDPs with (cid:2)nite state spaces, but that they can also be applied successfully to factored continuous-state MDPs (CMDPs). We show how one can build an ALP-based approximation for such a model and contrast it to existing solution methods. We argue that this approach offers a robust alternative for solving high dimensional continuous-state space problems. The point is supported by experiments on three CMDP problems with 24-25 continuous state factors.