Learning Attractor Landscapes for Learning Motor Primitives

Neural Information Processing Systems

Many control problems take place in continuous state-action spaces, e.g., in manipulator robotics, where the control objective is often defined as finding a desired trajectory that reaches a particular goal state. While reinforcement learning offers a theoretical framework to learn such control policies from scratch, its applicability to higher-dimensional continuous state-action spaces remains rather limited to date. Instead of learning from scratch, in this paper we suggest learning a desired complex control policy by transforming an existing simple canonical control policy. For this purpose, we represent canonical policies in terms of differential equations with well-defined attractor properties. By nonlinearly transforming the canonical attractor dynamics using techniques from nonparametric regression, almost arbitrary new nonlinear policies can be generated without losing the stability properties of the canonical system.
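The canonical policy idea can be illustrated with a point-attractor system: a critically damped spring-damper pulls the state to the goal, and a phase-dependent nonlinear forcing term reshapes the transient. This is a minimal sketch in the spirit of the abstract, not the paper's exact formulation; the gains `alpha`, `beta` and the `forcing` hook are illustrative.

```python
# Minimal sketch of a canonical attractor policy: integrate
# y'' = alpha * (beta * (goal - y) - y') + f(phase) with Euler steps.
# alpha/beta are chosen for critical damping; forcing() is a placeholder
# for a learned nonlinear transformation of the canonical dynamics.

def simulate(goal, y0, forcing=lambda phase: 0.0, steps=200, dt=0.01,
             alpha=25.0, beta=6.25):
    """Run the attractor system and return the final state."""
    y, yd = y0, 0.0
    for i in range(steps):
        phase = i / steps  # canonical phase variable, runs 0 -> 1
        ydd = alpha * (beta * (goal - y) - yd) + forcing(phase)
        yd += ydd * dt
        y += yd * dt
    return y

# With zero forcing, the canonical system simply converges to the goal;
# a learned phase-dependent forcing term bends the trajectory en route
# without moving the attractor.
final = simulate(goal=1.0, y0=0.0)
```

Because the forcing term depends on the phase rather than on time or state feedback alone, the shaped trajectory inherits the goal-convergence of the underlying spring-damper system.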


Learning Semantic Similarity

Neural Information Processing Systems

The standard representation of text documents as bags of words suffers from well-known limitations, mostly due to its inability to exploit semantic similarity between terms. Attempts to incorporate some notion of term similarity include latent semantic indexing [8], the use of semantic networks [9], and probabilistic methods [5]. In this paper we propose two methods for inferring such similarity from a corpus. The first defines word similarity based on document similarity and vice versa, giving rise to a system of equations whose equilibrium point we use to obtain a semantic similarity measure. The second models semantic relations by means of a diffusion process on a graph defined by lexicon and co-occurrence information.
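The first method's fixed-point idea can be sketched on a toy corpus: word similarity is defined through the documents the words appear in, document similarity through the words they contain, and the two definitions are iterated to equilibrium. The tiny corpus and the plain averaging updates below are illustrative, not the paper's exact system of equations.

```python
# Toy fixed-point computation of word similarity from document similarity
# and vice versa. K[a][b] is the current word-similarity matrix; each round
# recomputes it from document similarities and renormalises the diagonal to 1.

docs = [["cat", "pet"], ["dog", "pet"], ["car", "road"]]
words = sorted({w for d in docs for w in d})

def iterate_similarity(docs, words, rounds=20):
    n = len(words)
    idx = {w: i for i, w in enumerate(words)}
    # start from identity: each word is similar only to itself
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(rounds):
        # document similarity = average pairwise word similarity
        def doc_sim(d1, d2):
            return sum(K[idx[a]][idx[b]] for a in d1 for b in d2) / (len(d1) * len(d2))
        # word similarity = average similarity of the documents containing them
        newK = [[0.0] * n for _ in range(n)]
        for a in words:
            for b in words:
                da = [d for d in docs if a in d]
                db = [d for d in docs if b in d]
                s = sum(doc_sim(x, y) for x in da for y in db) / (len(da) * len(db))
                newK[idx[a]][idx[b]] = s
        # renormalise so the diagonal stays at 1
        K = [[newK[i][j] / ((newK[i][i] * newK[j][j]) ** 0.5) for j in range(n)]
             for i in range(n)]
    return K, idx

K, idx = iterate_similarity(docs, words)
# "cat" and "dog" never co-occur, but both co-occur with "pet", so the
# equilibrium assigns them nonzero similarity; "cat" and "car" share no
# documents even indirectly and stay at zero.
```

The point of the construction is exactly this indirect propagation: similarity flows through shared documents, which a plain bag-of-words inner product cannot capture.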


Exponential Family PCA for Belief Compression in POMDPs

Neural Information Processing Systems

Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are intractable for large models. The intractability of these algorithms is due to a great extent to their generating an optimal policy over the entire belief space. However, in real POMDP problems most belief states are unlikely, and there is a structured, low-dimensional manifold of plausible beliefs embedded in the high-dimensional belief space. We introduce a new method for solving large-scale POMDPs by taking advantage of belief space sparsity. We reduce the dimensionality of the belief space using exponential family Principal Components Analysis [1], which allows us to turn the sparse, high-dimensional belief space into a compact, low-dimensional representation in terms of learned features of the belief state. We then plan directly on the low-dimensional belief features. By planning in a low-dimensional space, we can find policies for POMDPs that are orders of magnitude larger than can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and also on a mobile robot navigation task.
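The key property of the exponential family link is that a belief is reconstructed as a normalised exponential of a low-dimensional code, `b_hat = softmax(U @ z)`, so reconstructions are always valid probability distributions even when the code is very low-dimensional. The basis `U`, the gradient fit, and the toy belief below are illustrative assumptions, not the paper's algorithm.

```python
# Sketch of the reconstruction step in exponential-family PCA for beliefs:
# given a fixed low-rank basis U, fit the code z whose softmax reconstruction
# is closest in KL divergence to a sparse belief b.

import math, random

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def reconstruct(U, z):
    # linear code in log-space, then normalise (exponential-family link)
    return softmax([sum(U[i][k] * z[k] for k in range(len(z))) for i in range(len(U))])

def fit_code(U, b, steps=1000, lr=0.1):
    """Gradient descent on KL(b || softmax(U z)); the gradient is U^T (b_hat - b)."""
    z = [0.0] * len(U[0])
    for _ in range(steps):
        b_hat = reconstruct(U, z)
        g = [sum(U[i][k] * (b_hat[i] - b[i]) for i in range(len(U)))
             for k in range(len(z))]
        z = [zk - lr * gk for zk, gk in zip(z, g)]
    return z

# a sparse 6-state belief compressed to a 2-dimensional code
random.seed(0)
U = [[random.gauss(0, 1) for _ in range(2)] for _ in range(6)]
b = [0.7, 0.3, 0.0, 0.0, 0.0, 0.0]
z = fit_code(U, b)
b_hat = reconstruct(U, z)  # nonnegative and sums to 1 by construction
```

Planning then operates on `z` rather than `b`, which is what makes much larger POMDPs tractable.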



On the Dirichlet Prior and Bayesian Regularization

Neural Information Processing Systems

In the Bayesian approach, regularization is achieved by specifying a prior distribution over the parameters and subsequently averaging over the posterior distribution. This regularization provides not only smoother estimates of the parameters compared to maximum likelihood but also guides the selection of model structures. It was pointed out in [6] that a very large scale parameter of the Dirichlet prior can degrade predictive accuracy due to severe regularization of the parameter estimates. We complement this discussion here and show that a very small scale parameter can lead to poor over-regularized structures when a product of (conjugate) Dirichlet priors is used over multinomial conditional distributions (Section 3). Section 4 demonstrates the effect of the scale parameter and how it can be calibrated. We focus on the class of Bayesian network models throughout this paper.
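The effect of the Dirichlet scale parameter on the parameter estimates can be seen directly in the posterior mean of a multinomial under a symmetric Dirichlet prior: a large scale pulls estimates toward uniform (heavy regularization), while a small scale approaches the maximum-likelihood estimate. The counts below are a toy illustration.

```python
# Posterior mean of a multinomial under a symmetric Dirichlet prior with
# total scale alpha (pseudo-count alpha/k per cell). Large alpha regularises
# heavily toward uniform; small alpha recovers maximum likelihood.

def posterior_mean(counts, alpha):
    k = len(counts)
    n = sum(counts)
    return [(c + alpha / k) / (n + alpha) for c in counts]

counts = [8, 1, 1]
ml = [c / sum(counts) for c in counts]            # maximum-likelihood estimate
small = posterior_mean(counts, alpha=0.1)         # close to ML
large = posterior_mean(counts, alpha=1000.0)      # close to uniform 1/3
```

The paper's structural point is the flip side of the same parameter: the scale also enters the marginal likelihood used to score network structures, which is why a very small scale can over-regularize the structure even as it under-regularizes the parameters.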


A Bilinear Model for Sparse Coding

Neural Information Processing Systems

Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformation-invariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.
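The bilinear generative model combines a feature code and a transformation code multiplicatively: an image is rendered as a sum of basis vectors weighted by products of the two codes. In the toy sketch below, a single "edge" feature is rendered at two positions by switching the transformation code, rather than by learning the shifted edge as a separate feature; all tensors are illustrative.

```python
# Toy bilinear generative model: image = sum_ij x_i * y_j * W[i][j],
# where x is a feature code, y a transformation code, and W[i][j] a basis
# vector for feature i under transformation j.

def render(W, x, y):
    """Combine basis vectors bilinearly into an output patch."""
    n = len(W[0][0])
    out = [0.0] * n
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            for p in range(n):
                out[p] += xi * yj * W[i][j][p]
    return out

# one feature (an edge), two transformation states (shift 0 and shift 1)
edge = [1.0, -1.0, 0.0, 0.0]
shifted = [0.0, 1.0, -1.0, 0.0]
W = [[edge, shifted]]                      # W[feature][transform]
at_origin = render(W, x=[1.0], y=[1.0, 0.0])
at_shift = render(W, x=[1.0], y=[0.0, 1.0])
```

A standard sparse coder would need two dictionary elements for these two patches; the bilinear code reuses one feature and moves the redundancy into the transformation code.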


Adaptive Quantization and Density Estimation in Silicon

Neural Information Processing Systems

We present the bump mixture model, a statistical model for analog data where the probabilistic semantics, inference, and learning rules derive from low-level transistor behavior. The bump mixture model relies on translinear circuits to perform probabilistic inference, and floating-gate devices to perform adaptation. This system is low power, asynchronous, and fully parallel, and supports various on-chip learning algorithms. In addition, the mixture model can perform several tasks such as probability estimation, vector quantization, classification, and clustering. We tested a fabricated system on clustering, quantization, and classification of handwritten digits and show performance comparable to the EM algorithm on mixtures of Gaussians.


Self Supervised Boosting

Neural Information Processing Systems

Boosting algorithms, and successful applications thereof, abound for classification and regression learning problems, but not for unsupervised learning. We propose a sequential approach to adding features to a random field model by training them to improve classification performance between the data and an equal-sized sample of "negative examples" generated from the model's current estimate of the data density.
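One round of this self-supervised scheme can be sketched as follows: draw "negative examples" from the model's current density estimate (here simply uniform), then train a feature to discriminate the real data from those negatives. The interval-indicator feature and the uniform initial model are illustrative simplifications of the paper's random-field features.

```python
# One boosting round: fit the interval feature that best separates real data
# from negative examples sampled from the current (uniform) model. The
# learned feature would then be added to the random field, sharpening the
# density where the data lives, and the next round repeats with fresh
# negatives from the updated model.

import random

random.seed(1)
data = [random.gauss(0.0, 0.5) for _ in range(200)]       # real samples near 0
negatives = [random.uniform(-3, 3) for _ in range(200)]   # current model: uniform

def best_interval_feature(pos, neg, grid):
    """Pick the interval [lo, hi] whose indicator best classifies pos vs neg."""
    best, best_acc = None, 0.0
    for lo in grid:
        for hi in grid:
            if hi <= lo:
                continue
            inside = lambda x: lo <= x <= hi
            acc = (sum(inside(x) for x in pos) +
                   sum(not inside(x) for x in neg)) / (len(pos) + len(neg))
            if acc > best_acc:
                best, best_acc = (lo, hi), acc
    return best, best_acc

grid = [i * 0.25 - 3.0 for i in range(25)]
(lo, hi), acc = best_interval_feature(data, negatives, grid)
# the winning interval brackets the data's mode, where real samples are
# densest relative to the model's samples
```

The supervision is "self" generated: no labels are needed, because the negative class is manufactured from the model itself.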


Application of Variational Bayesian Approach to Speech Recognition

Neural Information Processing Systems

Shinji Watanabe, Yasuhiro Minami, Atsushi Nakamura and Naonori Ueda (NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan). In this paper, we propose a Bayesian framework that constructs shared-state triphone HMMs based on a variational Bayesian approach and recognizes speech based on Bayesian prediction classification: variational Bayesian estimation and clustering for speech recognition (VBEC). An appropriate model structure with high recognition performance can be found within the VBEC framework. Unlike conventional methods, including the BIC and MDL criteria based on the maximum likelihood approach, the proposed model selection is valid in principle even when there are insufficient amounts of data, because it does not rely on an asymptotic assumption. In acoustic modeling, the triphone-based hidden Markov model (triphone HMM) has been widely employed. The triphone is a context-dependent phoneme unit that considers both the preceding and following phonemes.


Fast Kernels for String and Tree Matching

Neural Information Processing Systems

In this paper we present a new algorithm suitable for matching discrete objects such as strings and trees in linear time, thus obviating dynamic programming with its quadratic time complexity. Furthermore, prediction cost can in many cases be reduced to cost linear in the length of the sequence to be classified, regardless of the number of support vectors. This improvement on the currently available algorithms makes string kernels a viable alternative for the practitioner.
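The flavour of a linear-time string kernel can be conveyed with the k-spectrum kernel: each string is summarised by a hash table of its k-mer counts in one pass, and the kernel value is the inner product of the two count vectors. This stands in for the quadratic dynamic-programming alternative; it is an illustration of the general idea, not the paper's suffix-tree construction.

```python
# k-spectrum string kernel in time linear in the string lengths (for fixed k):
# count k-mers with a hash table, then take the inner product of the counts.

from collections import Counter

def spectrum(s, k):
    """One pass over s: multiset of its length-k substrings."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def spectrum_kernel(s, t, k=3):
    cs, ct = spectrum(s, k), spectrum(t, k)
    if len(cs) > len(ct):          # iterate over the smaller dictionary
        cs, ct = ct, cs
    return sum(c * ct[m] for m, c in cs.items())

k_self = spectrum_kernel("abracadabra", "abracadabra")   # shared 3-mers, squared counts
k_sub = spectrum_kernel("abracadabra", "cadabra")        # substring overlap
k_none = spectrum_kernel("abracadabra", "xyzxyzxyz")     # no shared 3-mers
```

Because the per-string work is a single scan and the inner product touches only observed k-mers, the cost never depends on comparing all substring pairs.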