Eigenvoice Speaker Adaptation via Composite Kernel Principal Component Analysis

Neural Information Processing Systems

Eigenvoice speaker adaptation has been shown to be effective when only a small amount of adaptation data is available. At the heart of the method is principal component analysis (PCA) employed to find the most important eigenvoices. In this paper, we postulate that nonlinear PCA, in particular kernel PCA, may be even more effective. One major challenge is to map the feature-space eigenvoices back to the observation space so that the state observation likelihoods can be computed during the estimation of eigenvoice weights and subsequent decoding. Our solution is to compute kernel PCA using composite kernels, and we will call our new method kernel eigenvoice speaker adaptation. On the TIDIGITS corpus, we found that compared with a speaker-independent model, our kernel eigenvoice adaptation method can reduce the word error rate by 28-33% while the standard eigenvoice approach can only match the performance of the speaker-independent model.
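
For readers unfamiliar with the kernel-PCA step at the core of the method, the sketch below shows kernel PCA with a composite kernel formed as a weighted sum of Gaussian kernels (a sum of valid kernels is again a valid kernel). The kernel choices, weights, and toy "supervector" data are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Squared Euclidean distances between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def composite_kernel(X, Y, gammas=(0.5, 2.0), weights=(0.7, 0.3)):
    # Weighted sum of base kernels; the sum is again a kernel.
    return sum(w * rbf_kernel(X, Y, g) for g, w in zip(gammas, weights))

def kernel_pca(K, n_components):
    n = K.shape[0]
    # Center the kernel matrix in feature space.
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Scale eigenvectors so the feature-space axes have unit norm.
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    return alphas, Kc

# Toy "supervector" data standing in for per-speaker model parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
K = composite_kernel(X, X)
alphas, Kc = kernel_pca(K, n_components=3)
projections = Kc @ alphas   # rows: kernel-PCA coordinates ("eigenvoice" weights)
print(projections.shape)
```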


Eye Movements for Reward Maximization

Neural Information Processing Systems

Recent eye-tracking studies in natural tasks suggest that there is a tight link between eye movements and goal-directed motor actions. However, most existing models of human eye movements provide a bottom-up account that relates visual attention to attributes of the visual scene. The purpose of this paper is to introduce a new model of human eye movements that directly ties eye movements to the ongoing demands of behavior. The basic idea is that eye movements serve to reduce uncertainty about environmental variables that are task-relevant. A value is assigned to an eye movement by estimating the expected cost of the uncertainty that will result if the movement is not made. If there are several candidate eye movements, the one with the highest expected value is chosen. The model is illustrated using a humanoid graphic figure that navigates on a sidewalk in a virtual urban environment. Simulations show our protocol is superior to a simple round-robin scheduling mechanism.
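
A minimal sketch of the decision rule just described: each candidate fixation is scored by the expected cost avoided through the uncertainty reduction it would bring, and the highest-valued fixation is chosen. The task variables, cost rates, and Gaussian measurement model below are hypothetical stand-ins, not the paper's simulation.

```python
import numpy as np

# Hypothetical task variables the agent tracks, each with a current
# uncertainty (variance) and a cost rate for acting under that uncertainty.
variables = {
    "obstacle_distance": {"variance": 4.0, "cost_per_unit_var": 2.0},
    "sidewalk_heading":  {"variance": 1.0, "cost_per_unit_var": 3.0},
    "litter_location":   {"variance": 9.0, "cost_per_unit_var": 0.5},
}

def expected_value_of_fixation(var, cost_rate, noise_var=0.5):
    # Fixating yields a measurement with variance noise_var; the posterior
    # variance follows the standard Gaussian update. The value of the
    # fixation is the expected cost avoided by that variance reduction.
    posterior_var = 1.0 / (1.0 / var + 1.0 / noise_var)
    return cost_rate * (var - posterior_var)

def choose_fixation(variables):
    # Pick the eye movement with the highest expected value.
    values = {name: expected_value_of_fixation(v["variance"], v["cost_per_unit_var"])
              for name, v in variables.items()}
    return max(values, key=values.get), values

target, values = choose_fixation(variables)
print(target, values)
```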


A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters

Neural Information Processing Systems

Given an N × N grid of squares, where each square has a count and an underlying population, our goal is to find the square region with the highest density, and to calculate its significance by randomization. Any density measure D, dependent on the total count and total population of a region, can be used. For example, if each count represents the number of disease cases occurring in that square, we can use Kulldorff's spatial scan statistic D.
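
To make the problem concrete, here is a brute-force baseline (not the paper's fast multi-resolution algorithm): scan every square region for the highest density, then estimate significance by randomization, redistributing the total count over squares in proportion to population. The simple count/population density used below is an illustrative choice.

```python
import numpy as np

def best_square_region(counts, pops):
    # Brute-force scan over all square regions; the paper's multi-resolution
    # method avoids most of this work. Density here is simply count/population.
    n = counts.shape[0]
    best = (-np.inf, None)
    # 2-D prefix sums make each region's totals O(1) to evaluate.
    C = np.pad(counts, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    P = np.pad(pops,   ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    def box(S, i, j, k):
        return S[i + k, j + k] - S[i, j + k] - S[i + k, j] + S[i, j]
    for k in range(1, n + 1):
        for i in range(n - k + 1):
            for j in range(n - k + 1):
                c, p = box(C, i, j, k), box(P, i, j, k)
                if p > 0 and c / p > best[0]:
                    best = (c / p, (i, j, k))
    return best

def significance(counts, pops, n_sims=99, rng=None):
    # Randomization test: redistribute the total count in proportion to
    # population and see how often a replica's best density beats the
    # observed best.
    rng = rng or np.random.default_rng(0)
    observed, _ = best_square_region(counts, pops)
    total = int(counts.sum())
    probs = (pops / pops.sum()).ravel()
    beats = 0
    for _ in range(n_sims):
        sim = rng.multinomial(total, probs).reshape(counts.shape).astype(float)
        if best_square_region(sim, pops)[0] >= observed:
            beats += 1
    return (beats + 1) / (n_sims + 1)

rng = np.random.default_rng(1)
pops = rng.integers(50, 150, size=(8, 8)).astype(float)
counts = rng.poisson(pops * 0.02).astype(float)
print(best_square_region(counts, pops))
print(significance(counts, pops))
```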


Multiple Instance Learning via Disjunctive Programming Boosting

Neural Information Processing Systems

Learning from ambiguous training data is highly relevant in many applications. We present a new learning algorithm for classification problems where labels are associated with sets of patterns instead of individual patterns. This encompasses multiple instance learning as a special case. Our approach is based on a generalization of linear programming boosting and uses results from disjunctive programming to generate successively stronger linear relaxations of a discrete non-convex problem.
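
As background, the sketch below solves the soft-margin linear programming boosting problem that the paper generalizes, over a fixed pool of decision stumps rather than by column generation, and without the disjunctive-programming relaxations that handle label ambiguity. The dataset and the regularization constant D are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def stump_pool(X):
    # Decision stumps: sign(x[:, f] - t) for each feature f and threshold t.
    return [(f, t) for f in range(X.shape[1]) for t in np.unique(X[:, f])]

def stump_outputs(X, stumps):
    # H[i, j] = output of stump j on example i, in {-1, +1}.
    H = np.empty((X.shape[0], len(stumps)))
    for j, (f, t) in enumerate(stumps):
        H[:, j] = np.where(X[:, f] > t, 1.0, -1.0)
    return H

def lp_boost(H, y, D=1.0):
    # Soft-margin LP boosting over a fixed hypothesis pool:
    #   max rho - D * sum(xi)   s.t.  y_i (H a)_i + xi_i >= rho,
    #   sum(a) = 1, a >= 0, xi >= 0.
    n, m = H.shape
    # Variable order: [a (m), rho (1), xi (n)].
    c = np.concatenate([np.zeros(m), [-1.0], D * np.ones(n)])
    A_ub = np.hstack([-(y[:, None] * H), np.ones((n, 1)), -np.eye(n)])
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0], np.zeros(n)])[None, :]
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[:m]

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0] + X[:, 1])
y[y == 0] = 1.0
stumps = stump_pool(X)
H = stump_outputs(X, stumps)
a = lp_boost(H, y)
print("train accuracy:", (np.sign(H @ a) == y).mean())
```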


Tree-structured Approximations by Expectation Propagation

Neural Information Processing Systems

Approximation structure plays an important role in inference on loopy graphs. As a tractable structure, tree approximations have been utilized in the variational method of Ghahramani & Jordan (1997) and the sequential projection method of Frey et al. (2000). However, belief propagation represents each factor of the graph with a product of single-node messages. In this paper, belief propagation is extended to represent factors with tree approximations, by way of the expectation propagation framework. That is, each factor sends a "message" to all pairs of nodes in a tree structure. The result is more accurate inferences and more frequent convergence than ordinary belief propagation, at a lower cost than variational trees or double-loop algorithms.
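
For contrast with the tree-structured messages proposed here, the following sketch implements ordinary single-node message passing (sum-product) on a chain, where it is exact; the paper's method enriches these single-node messages with pairwise tree terms on loopy graphs. The potentials below are toy values.

```python
import numpy as np

def chain_marginals(unary, pairwise):
    # Exact sum-product on a chain: forward and backward messages, then node
    # marginals. Each message is a single-node factor -- the representation
    # that tree EP enriches with pairwise terms.
    T, K = unary.shape
    fwd = np.ones((T, K))
    bwd = np.ones((T, K))
    for t in range(1, T):
        fwd[t] = (fwd[t - 1] * unary[t - 1]) @ pairwise
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):
        bwd[t] = pairwise @ (bwd[t + 1] * unary[t + 1])
        bwd[t] /= bwd[t].sum()
    marg = fwd * unary * bwd
    return marg / marg.sum(axis=1, keepdims=True)

K = 3
pairwise = np.full((K, K), 0.1) + 0.7 * np.eye(K)   # "sticky" pairwise potential
unary = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.1, 0.2, 0.7],
                  [0.3, 0.3, 0.4]])
print(chain_marginals(unary, pairwise))
```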


Fast Algorithms for Large-State-Space HMMs with Applications to Web Usage Analysis

Neural Information Processing Systems

In applying Hidden Markov Models to the analysis of massive data streams, it is often necessary to use an artificially reduced set of states; this is due in large part to the fact that the basic HMM estimation algorithms have a quadratic dependence on the size of the state set. We present algorithms that reduce this computational bottleneck to linear or near-linear time, when the states can be embedded in an underlying grid of parameters. This type of state representation arises in many domains; in particular, we show an application to traffic analysis at a high-volume Web site.
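
A sketch of the kind of speedup involved, under an illustrative assumption: states lie on a 1-D parameter grid and transitions are (approximately) uniform over a window of neighboring grid points, so the O(K^2) transition step of the forward pass collapses to an O(K) running-sum convolution. This shows the structural idea, not the paper's algorithms.

```python
import numpy as np

def box_smooth(v, radius):
    # O(K) convolution of v with a uniform window, via cumulative sums.
    # This stands in for multiplying by a K x K transition matrix, which
    # would cost O(K^2) per time step.
    c = np.concatenate([[0.0], np.cumsum(v)])
    idx = np.arange(len(v))
    lo = np.maximum(idx - radius, 0)
    hi = np.minimum(idx + radius + 1, len(v))
    return (c[hi] - c[lo]) / (hi - lo)

def forward_loglik(log_obs, radius):
    # Forward pass for an HMM whose states lie on a 1-D grid and whose
    # transitions are roughly uniform over a window of neighbors; boundary
    # effects are absorbed by the per-step renormalization.
    T, K = log_obs.shape
    alpha = np.full(K, 1.0 / K)
    loglik = 0.0
    for t in range(T):
        if t > 0:
            alpha = box_smooth(alpha, radius)   # O(K) transition step
        m = log_obs[t].max()
        alpha = alpha * np.exp(log_obs[t] - m)
        z = alpha.sum()
        loglik += np.log(z) + m
        alpha /= z
    return loglik

rng = np.random.default_rng(0)
log_obs = rng.normal(size=(50, 1000))   # 50 time steps, 1000 grid states
print(forward_loglik(log_obs, radius=5))
```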


On the Dynamics of Boosting

Neural Information Processing Systems

In order to understand AdaBoost's dynamics, especially its ability to maximize margins, we derive an associated simplified nonlinear iterated map and analyze its behavior in low-dimensional cases. We find stable cycles for these cases, which can be used explicitly to solve for AdaBoost's output. By considering AdaBoost as a dynamical system, we are able to prove Rätsch and Warmuth's conjecture that AdaBoost may fail to converge to a maximal-margin combined classifier when given a 'non-optimal' weak learning algorithm.
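
The cycling behavior is easy to reproduce by direct simulation of AdaBoost's example-weight dynamics on a tiny fixed hypothesis set; the paper studies such low-dimensional systems analytically via the iterated map. The 3 x 3 problem below, in which each weak classifier errs on exactly one example, is a standard illustrative case and is not taken from the paper.

```python
import numpy as np

def adaboost_weight_dynamics(M, T=12):
    # M[i, j] = y_i * h_j(x_i): +1 if weak classifier j is right on example i,
    # -1 if wrong. Track the example-weight vector d over iterations; on small
    # problems it can settle into a stable cycle.
    n, m = M.shape
    d = np.full(n, 1.0 / n)
    history = []
    for _ in range(T):
        edges = M.T @ d
        j = int(np.argmax(edges))            # "optimal" weak-learner choice
        r = edges[j]
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        d = d * np.exp(-alpha * M[:, j])
        d = d / d.sum()
        history.append((j, np.round(d, 4)))
    return history

# Three examples, three hypotheses: each h_j errs on example j only.
M = np.array([[-1.0,  1.0,  1.0],
              [ 1.0, -1.0,  1.0],
              [ 1.0,  1.0, -1.0]])
for j, d in adaboost_weight_dynamics(M):
    print(j, d)
```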


Inferring State Sequences for Non-linear Systems with Embedded Hidden Markov Models

Neural Information Processing Systems

We describe a Markov chain method for sampling from the distribution of the hidden state sequence in a nonlinear dynamical system, given a sequence of observations. This method updates all states in the sequence simultaneously using an embedded Hidden Markov Model (HMM). An update begins with the creation of "pools" of candidate states at each time. We then define an embedded HMM whose states are indexes within these pools. Using a forward-backward dynamic programming algorithm, we can efficiently choose a state sequence with the appropriate probabilities from the exponentially large number of state sequences that pass through states in these pools. We illustrate the method in a simple one-dimensional example, and in an example showing how an embedded HMM can be used, in effect, to discretize the state space without any discretization error. We also compare the embedded HMM to a particle smoother on a more substantial problem of inferring human motion from 2D traces of markers.
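
A sketch of the sampling step described above, under simplifying assumptions: one-dimensional states, Gaussian dynamics and observations, and pools drawn near the observations. It omits the importance-weight correction for how the pool states were proposed, which the full method requires, so it is illustrative only.

```python
import numpy as np

def sample_embedded_hmm_path(pools, obs, trans_logpdf, obs_logpdf, rng):
    # pools[t] is an array of candidate states at time t. Build an HMM over
    # pool indices and draw a full state sequence by forward filtering /
    # backward sampling, in O(T * L^2) for pool size L.
    T, L = len(pools), len(pools[0])
    log_alpha = np.zeros((T, L))
    log_alpha[0] = obs_logpdf(obs[0], pools[0])
    for t in range(1, T):
        # logtrans[j, k] = log p(pools[t][k] | pools[t-1][j])
        logtrans = trans_logpdf(pools[t - 1][:, None], pools[t][None, :])
        prev = log_alpha[t - 1][:, None] + logtrans
        log_alpha[t] = (obs_logpdf(obs[t], pools[t])
                        + np.logaddexp.reduce(prev, axis=0))
    # Backward sampling of pool indices.
    path = np.empty(T, dtype=int)
    p = np.exp(log_alpha[-1] - log_alpha[-1].max())
    path[-1] = rng.choice(L, p=p / p.sum())
    for t in range(T - 2, -1, -1):
        lp = log_alpha[t] + trans_logpdf(pools[t], pools[t + 1][path[t + 1]])
        p = np.exp(lp - lp.max())
        path[t] = rng.choice(L, p=p / p.sum())
    return np.array([pools[t][path[t]] for t in range(T)])

# Toy 1-D random-walk dynamics with Gaussian observations.
rng = np.random.default_rng(0)
trans = lambda a, b: -0.5 * (b - a) ** 2      # log N(b; a, 1), up to a constant
emit = lambda y, s: -0.5 * (y - s) ** 2       # log N(y; s, 1), up to a constant
T, L = 10, 25
true = np.cumsum(rng.normal(size=T))
obs = true + rng.normal(size=T)
pools = [obs[t] + rng.normal(size=L) for t in range(T)]  # candidates near obs
print(sample_embedded_hmm_path(pools, obs, trans, emit, rng))
```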


Semi-supervised Protein Classification Using Cluster Kernels

Neural Information Processing Systems

A key issue in supervised protein classification is the representation of input sequences of amino acids. Recent work using string kernels for protein data has achieved state-of-the-art classification performance. However, such representations are based only on labeled data -- examples with known 3D structures, organized into structural classes -- while in practice, unlabeled data is far more plentiful.
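
One simple construction in this spirit is a bagged clustering kernel, sketched below: the kernel value for two points is the fraction of clustering runs (over labeled and unlabeled data together) in which they land in the same cluster, so unlabeled data shapes the kernel. The data, parameters, and use of generic feature vectors rather than string-kernel representations are illustrative assumptions; this may not match the paper's exact variants.

```python
import numpy as np

def kmeans_labels(X, k, rng, iters=20):
    # Minimal k-means; returns a cluster label per row of X.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return labels

def bagged_cluster_kernel(X, k=3, n_runs=20, seed=0):
    # K[i, j] = fraction of clustering runs in which points i and j fall in
    # the same cluster. Unlabeled points influence the clusterings, so the
    # kernel reflects the shape of the data, not just pairwise similarity.
    rng = np.random.default_rng(seed)
    n = len(X)
    K = np.zeros((n, n))
    for _ in range(n_runs):
        labels = kmeans_labels(X, k, rng)
        K += (labels[:, None] == labels[None, :])
    return K / n_runs

# Labeled + unlabeled points jointly define the kernel; a standard kernel
# machine (e.g. an SVM) can then be trained on the labeled block of K.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 2))
               for c in ([0, 0], [3, 0], [0, 3])])
K = bagged_cluster_kernel(X)
print(K.shape, K[0, 1], K[0, 35])
```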


Learning Curves for Stochastic Gradient Descent in Linear Feedforward Networks

Neural Information Processing Systems

Gradient-following learning methods can encounter problems of implementation in many applications, and stochastic variants are frequently used to overcome these difficulties. We derive quantitative learning curves for three online training methods used with a linear perceptron: direct gradient descent, node perturbation, and weight perturbation. The maximum learning rate for the stochastic methods scales inversely with the first power of the dimensionality of the noise injected into the system; with sufficiently small learning rate, all three methods give identical learning curves. These results suggest guidelines for when these stochastic methods will be limited in their utility, and considerations for architectures in which they will be effective.
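
To illustrate the dimensionality scaling referred to above, here is a sketch (not the paper's derivation) comparing direct gradient descent with weight perturbation on a linear map. The one-sample perturbation estimate of the gradient has variance that grows with the number of perturbed weights, so its stable learning rate must be correspondingly smaller; the learning rates and noise scale below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 10, 2
W_true = rng.normal(size=(n_out, n_in))
X = rng.normal(size=(200, n_in))
Y = X @ W_true.T

def loss(W, x, y):
    e = W @ x - y
    return 0.5 * (e ** 2).sum()

def train(method, eta, sigma=0.01, epochs=5):
    W = np.zeros((n_out, n_in))
    losses = []
    for _ in range(epochs):
        for x, y in zip(X, Y):
            if method == "gradient":
                # Exact online gradient of the squared error.
                e = W @ x - y
                W = W - eta * np.outer(e, x)
            else:  # weight perturbation
                # Perturb ALL weights at once and use the loss change as a
                # one-sample gradient estimate. Its variance grows with the
                # number of perturbed parameters, which is why the usable
                # learning rate shrinks as the network gets larger.
                Z = sigma * rng.normal(size=W.shape)
                dE = loss(W + Z, x, y) - loss(W, x, y)
                W = W - eta * (dE / sigma ** 2) * Z
            losses.append(loss(W, x, y))
    return np.mean(losses[-100:])

print("direct gradient    :", train("gradient", eta=0.01))
print("weight perturbation:", train("weight_perturbation", eta=2e-4))
```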