AITopics

The choice of an SVM kernel corresponds to the choice of a representation of the data in a feature space and, to improve performance, it should therefore incorporate prior knowledge such as known transformation invariances. We propose a technique which extends earlier work and aims at incorporating invariances in nonlinear kernels. We show on a digit recognition task that the proposed approach is superior to the Virtual Support Vector method, which previously had been the method of choice.

1 Introduction

In some classification tasks, a priori knowledge about the invariances related to the task is available. For instance, in image classification, we know that the label of a given image should not change after a small translation or rotation.

invariance, tangent vector, vector

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Blei, David M., Ng, Andrew Y., Jordan, Michael I.

Latent Dirichlet Allocation

We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms.

lda, likelihood, probability

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > New York (0.05)
Asia > Middle East > Jordan (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
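The generative story in the Latent Dirichlet Allocation abstract above (a document as a mixture of topics, with mixture proportions drawn from a Dirichlet distribution) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `generate_document`, the parameter names `alpha` and `beta`, the fixed document length, and the toy values in the usage lines are all assumptions introduced here.

```python
import numpy as np

def generate_document(rng, alpha, beta, n_words):
    """Sample one document from an LDA-style generative process.

    alpha: Dirichlet parameter, shape (n_topics,)            (assumed name)
    beta:  per-topic word distributions, (n_topics, vocab)   (assumed name)
    """
    # Continuous-valued topic proportions for this document.
    theta = rng.dirichlet(alpha)
    # One latent topic assignment per word position.
    topics = rng.choice(len(alpha), size=n_words, p=theta)
    # Each word is drawn from the distribution of its assigned topic.
    words = np.array([rng.choice(beta.shape[1], p=beta[z]) for z in topics])
    return theta, topics, words

# Toy usage: 3 topics over a 4-word vocabulary (values are illustrative only).
rng = np.random.default_rng(0)
alpha = np.array([0.5, 0.5, 0.5])
beta = np.array([[0.7, 0.1, 0.1, 0.1],
                 [0.1, 0.7, 0.1, 0.1],
                 [0.1, 0.1, 0.1, 0.7]])
theta, topics, words = generate_document(rng, alpha, beta, n_words=50)
```

Inference, which the abstract says is carried out with variational algorithms, runs in the opposite direction and is not shown here.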

Bi, J., Bennett, Kristin P.

Duality, Geometry, and Support Vector Regression

We develop an intuitive geometric framework for support vector regression (SVR). By examining when ɛ-tubes exist, we show that SVR can be regarded as a classification problem in the dual space. Hard and soft ɛ-tubes are constructed by separating the convex or reduced convex hulls, respectively, of the training data with the response variable shifted up and down by ɛ. A novel SVR model is proposed based on choosing the max-margin plane between the two shifted datasets.

convex hull, plane, rc-svr

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
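The classification view of SVR described in the abstract above can be illustrated with a small sketch: shift the responses up and down by ɛ to obtain two point sets, then ask whether a candidate plane separates them. The function names `shifted_sets` and `separates` and the toy data are assumptions made for illustration. The sketch only checks a given linear function and does not search for the max-margin plane the paper proposes.

```python
import numpy as np

def shifted_sets(X, y, eps):
    """Return the training data with the response shifted up and down by eps."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    upper = np.column_stack([X, y + eps])  # points (x_i, y_i + eps)
    lower = np.column_stack([X, y - eps])  # points (x_i, y_i - eps)
    return upper, lower

def separates(w, b, upper, lower):
    """Check whether the plane t = w.x + b lies between the two shifted sets.

    This holds exactly when |w.x_i + b - y_i| <= eps for every i, i.e. when
    the linear function sits inside a hard eps-tube around the data.
    """
    w = np.asarray(w, dtype=float)
    side = lambda P: P[:, :-1] @ w + b - P[:, -1]
    return bool(np.all(side(upper) <= 0) and np.all(side(lower) >= 0))

# Toy usage with illustrative values.
X = [[0.0], [1.0], [2.0]]
y = [0.0, 1.1, 1.9]
wide = separates([1.0], 0.0, *shifted_sets(X, y, eps=0.2))
narrow = separates([1.0], 0.0, *shifted_sets(X, y, eps=0.05))
```

With these values the line t = x fits inside the wider tube but not the narrower one.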

Belkin, Mikhail, Niyogi, Partha

Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering

Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering.

graph, laplacian, manifold

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Data Science > Data Mining (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
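A graph-Laplacian embedding of the kind the Laplacian Eigenmaps abstract above describes might be sketched as below. The abstract does not give the construction, so the k-nearest-neighbour graph, the heat-kernel weights with parameter `t`, and the function name `laplacian_eigenmaps` are assumptions of this sketch. It solves the generalized eigenproblem L y = λ D y through the symmetrically normalized Laplacian, using NumPy only.

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=2, k=5, t=1.0):
    """Embed the rows of X using eigenvectors of a graph Laplacian (sketch)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Indices of the k nearest neighbours of each point (column 0 is the point itself).
    idx = np.argsort(sq, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    # Heat-kernel weights on neighbour edges, then symmetrise the graph.
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-sq[rows, cols] / t)
    W = np.maximum(W, W.T)
    d = W.sum(axis=1)                      # vertex degrees
    L = np.diag(d) - W                     # unnormalised graph Laplacian
    # Solve L y = lambda D y via D^{-1/2} L D^{-1/2} v = lambda v, y = D^{-1/2} v.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)        # eigenvalues in ascending order
    # Skip the first (trivial) eigenvector and keep the next n_components.
    return d_inv_sqrt[:, None] * vecs[:, 1:n_components + 1]

# Toy usage: points on a circle embedded in 3-D (illustrative data).
angles = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
X = np.column_stack([np.cos(angles), np.sin(angles), np.zeros(40)])
Y = laplacian_eigenmaps(X, n_components=2, k=4, t=1.0)
```

The dense distance matrix keeps the sketch short; a sparse neighbour graph and a sparse eigensolver would be the natural choice for larger datasets.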

Beal, Matthew J., Ghahramani, Zoubin, Rasmussen, Carl E.

The Infinite Hidden Markov Model

We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite; consider, for example, symbols being possible words appearing in English text.

matrix, sequence, transition

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Bach, Francis R., Jordan, Michael I.

Thin Junction Trees

We present an algorithm that induces a class of models with thin junction trees, that is, models that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the learning process. This allows both an efficient implementation of an iterative scaling parameter estimation algorithm and also ensures that inference can be performed efficiently with the final model. We illustrate the approach with applications in handwritten digit recognition and DNA splice site detection.

algorithm, junction tree, thin junction tree

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Andrieu, Christophe, de Freitas, Nando, Doucet, Arnaud

Rao-Blackwellised Particle Filtering via Data Augmentation

Sequential Monte Carlo (SMC) is often referred to as particle filtering (PF) in the context of computing filtering distributions for statistical inference and learning. It is known that the performance of PF often deteriorates in high-dimensional state spaces. In the past, we have shown that if a model admits partial analytical tractability, it is possible to combine PF with exact algorithms (Kalman filters, HMM filters, junction tree algorithm) to obtain efficient high-dimensional filters (Doucet, de Freitas, Murphy and Russell 2000; Doucet, Godsill and Andrieu 2000). In particular, we exploited a marginalisation technique known as Rao-Blackwellisation (RB). Here, we attack a more complex model that does not admit immediate analytical tractability.

algorithm, doucet, particle

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Zhang, T.

Generalization Performance of Some Learning Problems in Hilbert Functional Spaces

We investigate the generalization performance of some learning problems in Hilbert functional spaces. We introduce a notion of convergence of the estimated functional predictor to the best underlying predictor, and obtain an estimate on the rate of the convergence. This estimate allows us to derive generalization bounds on some learning formulations.

inequality, probability, theorem 3

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Education > Focused Education > Special Education (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Wong, K., Li, F.

Fast Parameter Estimation Using Green's Functions

It is well known that correct choices of hyperparameters in classification and regression tasks can optimize the complexity of the data model, and hence achieve the best generalization [1]. In recent years various methods have been proposed to estimate the optimal hyperparameters in different contexts, such as neural networks [2], support vector machines [3, 4, 5] and Gaussian processes [5]. Most of these methods are inspired by the technique of cross-validation or its variant, leave-one-out validation. While the leave-one-out procedure gives an almost unbiased estimate of the generalization error, it is nevertheless very tedious. Many of the mentioned attempts aimed at approximating this tedious procedure without really having to sweat through it.

cavity method, generalization error, green

Country:

Asia > China > Hong Kong (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
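To make concrete why the abstract above calls the leave-one-out procedure tedious, the following sketch runs it naively: the model is refit once per training example. Ridge regression is used purely as a stand-in learner, and the function names `ridge_fit` and `leave_one_out_error` and the hyperparameter name `lam` are assumptions made here. The Green's function approximation of the paper itself is not shown.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression weights (stand-in model, an assumption)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def leave_one_out_error(X, y, lam):
    """Naive leave-one-out estimate of squared error: n separate refits."""
    n = X.shape[0]
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i            # hold out example i
        w = ridge_fit(X[keep], y[keep], lam)
        errors[i] = (X[i] @ w - y[i]) ** 2  # error on the held-out example
    return errors.mean()

# Toy usage: noiseless linear data, so a tiny penalty should give near-zero error.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5])
loo_small = leave_one_out_error(X, y, lam=1e-8)
loo_large = leave_one_out_error(X, y, lam=100.0)
```

Choosing a hyperparameter this way means repeating the whole loop for every candidate value of `lam`, which is the cost that approximation schemes try to avoid.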

Tanaka, Toshiyuki, Ikeda, Shiro, Amari, Shun-ichi

Information-Geometrical Significance of Sparsity in Gallager Codes

We report a result of perturbation analysis on decoding error of the belief propagation decoder for Gallager codes. The analysis is based on information geometry, and it shows that the principal term of decoding error at equilibrium comes from the m-embedding curvature of the log-linear submanifold spanned by the estimated pseudoposteriors, one for the full marginal, and K for partial posteriors, each of which takes a single check into account, where K is the number of checks in the Gallager code. It is then shown that the principal error term vanishes when the parity-check matrix of the code is so sparse that there are no two columns with overlap greater than 1.

belief propagation decoder, gallager code, parity-check matrix

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)
Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)
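The sparsity condition stated in the Gallager codes abstract above (no two columns of the parity-check matrix overlapping in more than one position) is easy to check numerically. The sketch below is only an illustration of that condition: the function names `max_column_overlap` and `meets_sparsity_condition` and the small example matrices are assumptions, not taken from the paper.

```python
import numpy as np

def max_column_overlap(H):
    """Largest number of rows in which two distinct columns of H both contain a 1."""
    H = np.asarray(H, dtype=int)
    G = H.T @ H                # G[i, j] counts rows where columns i and j are both 1
    np.fill_diagonal(G, 0)     # ignore each column's overlap with itself
    return int(G.max())

def meets_sparsity_condition(H):
    """True when no two columns overlap in more than one position."""
    return max_column_overlap(H) <= 1

# Toy usage with illustrative 0/1 matrices.
H_sparse = [[1, 1, 0],
            [0, 1, 1]]
H_dense = [[1, 1],
           [1, 1]]
```

Here `H_sparse` satisfies the condition, while the two columns of `H_dense` share two checks and therefore do not.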