Goto

Collaborating Authors

Welling, Max


Accelerated Variational Dirichlet Process Mixtures

Neural Information Processing Systems

Dirichlet Process (DP) mixture models are promising candidates for clustering applications where the number of clusters is unknown a priori. Due to computational considerations, these models are unfortunately unsuitable for large-scale data-mining applications. We propose a class of deterministic accelerated DP mixture models that can routinely handle millions of data-cases. The speedup is achieved by incorporating kd-trees into a variational Bayesian algorithm for DP mixtures in the stick-breaking representation, similar to that of Blei and Jordan (2005). Our algorithm differs in its use of kd-trees and in the way it handles truncation: we only assume that the variational distributions are fixed at their priors after a certain level. Experiments show that speedups relative to the standard variational algorithm can be significant.
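
The central quantity in the stick-breaking variational algorithm is the expected log mixing weight of each cluster under Beta variational posteriors on the stick proportions. Below is a minimal sketch of that computation, assuming q(v_k) = Beta(a_k, b_k); the kd-tree acceleration and the paper's truncation handling (variational distributions fixed at their priors after a certain level) are not shown.

    import numpy as np
    from scipy.special import digamma

    def expected_log_stick_weights(a, b):
        # Under q(v_k) = Beta(a_k, b_k):
        #   E[log v_k]    = digamma(a_k) - digamma(a_k + b_k)
        #   E[log(1-v_k)] = digamma(b_k) - digamma(a_k + b_k)
        # and the stick-breaking construction gives
        #   log pi_k = log v_k + sum_{j<k} log(1 - v_j).
        e_log_v = digamma(a) - digamma(a + b)
        e_log_1mv = digamma(b) - digamma(a + b)
        # cumulative sum of E[log(1 - v_j)] over sticks broken before k
        cum = np.concatenate(([0.0], np.cumsum(e_log_1mv[:-1])))
        return e_log_v + cum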


Bayesian Model Scoring in Markov Random Fields

Neural Information Processing Systems

Scoring structures of undirected graphical models by evaluating the marginal likelihood is very hard. The main reason is the presence of the partition function, which is intractable to evaluate, let alone integrate over. We propose to approximate the marginal likelihood by employing two levels of approximation: we assume normality of the posterior (the Laplace approximation) and approximate all remaining intractable quantities using belief propagation and the linear response approximation.
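
As a rough illustration of the outer level of approximation, the following sketch computes a Laplace estimate of the log marginal likelihood from a MAP estimate and the (negative) Hessian of the log joint. In the paper those quantities are themselves intractable for MRFs and are in turn approximated with belief propagation and linear response, which this sketch does not attempt.

    import numpy as np

    def laplace_log_evidence(log_joint_at_map, neg_hessian):
        # Laplace approximation around the MAP estimate theta*:
        #   log p(D) ~= log p(D, theta*) + (d/2) log(2 pi) - (1/2) log|A|
        # where A is the negative Hessian of log p(D, theta) at theta*,
        # assumed positive definite.
        d = neg_hessian.shape[0]
        sign, logdet = np.linalg.slogdet(neg_hessian)
        assert sign > 0, "negative Hessian must be positive definite"
        return log_joint_at_map + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet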


A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation

Neural Information Processing Systems

Latent Dirichlet allocation (LDA) is a Bayesian network that has recently gained much popularity in applications ranging from document modeling to computer vision. Due to the large-scale nature of these applications, current inference procedures like variational Bayes and Gibbs sampling have been found lacking. In this paper we propose the collapsed variational Bayesian inference algorithm for LDA and show that it is computationally efficient, easy to implement, and significantly more accurate than standard variational Bayesian inference for LDA.
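
For intuition, here is a minimal sketch of one collapsed variational sweep for LDA in its simplified zeroth-order form (often called CVB0); the paper's algorithm additionally carries second-order correction terms from a Gaussian approximation to the collapsed counts. Names and array shapes are illustrative: gamma is (tokens x K), n_wk is (V x K), n_jk is (docs x K), n_k is (K,).

    import numpy as np

    def cvb0_sweep(gamma, tokens, n_wk, n_jk, n_k, alpha, beta, V):
        # gamma[i] is the K-vector q(z_i); n_wk, n_jk, n_k are soft
        # word-topic, document-topic and topic-total counts built from gamma.
        for i, (j, w) in enumerate(tokens):   # token i sits in doc j, word w
            g = gamma[i]
            # subtract token i's own soft counts ("leave one out")
            n_wk[w] -= g; n_jk[j] -= g; n_k -= g
            # update q(z_i) from the expected counts of all other tokens
            g = (n_wk[w] + beta) * (n_jk[j] + alpha) / (n_k + V * beta)
            g /= g.sum()
            gamma[i] = g
            # add the refreshed soft counts back in
            n_wk[w] += g; n_jk[j] += g; n_k += g
        return gamma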


Products of "Edge-perts"

Neural Information Processing Systems

Images represent an important and abundant source of data. Understanding their statistical structure has important applications such as image compression and restoration. In this paper we propose a particular kind of probabilistic model, dubbed the "products of edge-perts" model, to describe the structure of wavelet-transformed images. We develop a practical denoising algorithm based on a single edge-pert and show state-of-the-art denoising performance on benchmark images.
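
The denoiser operates in the wavelet domain. The sketch below is not the edge-pert denoiser itself but a generic wavelet soft-thresholding baseline (using the PyWavelets package) that shows the transform-shrink-invert pipeline such a model plugs into; the paper replaces the fixed threshold with MAP estimation under the learned edge-pert prior.

    import numpy as np
    import pywt

    def wavelet_soft_threshold(image, sigma, wavelet="db8", levels=3):
        # Decompose, shrink the detail coefficients, reconstruct.
        coeffs = pywt.wavedec2(image, wavelet, level=levels)
        # universal threshold as a simple, standard choice
        thresh = sigma * np.sqrt(2 * np.log(image.size))
        out = [coeffs[0]]                      # keep the approximation band
        for detail in coeffs[1:]:
            out.append(tuple(pywt.threshold(c, thresh, mode="soft")
                             for c in detail))
        return pywt.waverec2(out, wavelet)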


Linear Response for Approximate Inference

Neural Information Processing Systems

Belief propagation on cyclic graphs is an efficient algorithm for computing approximate marginal probability distributions over single nodes and neighboring nodes in the graph. In this paper we propose two new algorithms for approximating joint probabilities of arbitrary pairs of nodes and prove a number of desirable properties that these estimates fulfill. The first algorithm is a propagation algorithm which is shown to converge if belief propagation converges to a stable fixed point. The second algorithm is based on matrix inversion. Experiments compare a number of competing methods.
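
The linear response idea is that joint statistics can be read off from how single-node beliefs react to small perturbations of the local fields: Cov(x_i, x_j) ~ d<x_i>/dtheta_j. Below is a crude finite-difference illustration of this identity, assuming a user-supplied run_bp(theta) that returns the vector of approximate node means for, say, a binary MRF; the paper's two algorithms (propagation-based and matrix-inversion-based) compute this derivative exactly and far more efficiently.

    import numpy as np

    def linear_response_covariance(run_bp, theta, eps=1e-5):
        # Estimate C[i, j] = d<x_i>/dtheta_j by central differences
        # on the BP node means returned by run_bp.
        n = len(theta)
        C = np.zeros((n, n))
        for j in range(n):
            t_plus, t_minus = theta.copy(), theta.copy()
            t_plus[j] += eps
            t_minus[j] -= eps
            C[:, j] = (run_bp(t_plus) - run_bp(t_minus)) / (2 * eps)
        return 0.5 * (C + C.T)   # symmetrize the numerical estimate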


Extreme Components Analysis

Neural Information Processing Systems

Principal components analysis (PCA) is one of the most widely used techniques in machine learning and data mining. Minor components analysis (MCA) is less well known, but can also play an important role in the presence of constraints on the data distribution. In this paper we present a probabilistic model for "extreme components analysis" (XCA) which at the maximum likelihood solution extracts an optimal combination of principal and minor components. For a given number of components, the log-likelihood of the XCA model is guaranteed to be greater than or equal to that of the probabilistic models for PCA and MCA. We describe an efficient algorithm to solve for the globally optimal solution. For log-convex spectra we prove that the solution consists of principal components only, while for log-concave spectra the solution consists of minor components. In general, the solution admits a combination of both. In experiments we explore the properties of XCA on some synthetic and real-world datasets.
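
The structure of the solution can be explored with a brute-force search over splits of a sorted eigenvalue spectrum into p principal and d - p minor components, taken contiguously from the two ends of the spectrum. The score below is a PPCA-style Gaussian log-determinant in which the D - d discarded directions share their average eigenvalue; it is a hypothetical stand-in for the exact XCA objective derived in the paper, not the paper's efficient algorithm.

    import numpy as np

    def best_xca_split(eigvals, d):
        # Keep the p largest and d - p smallest eigenvalues of the data
        # covariance; assumes d < D so some directions are discarded.
        lam = np.sort(eigvals)[::-1]          # spectrum, descending
        D = lam.size
        best = None
        for p in range(d + 1):
            kept = np.concatenate((lam[:p], lam[D - (d - p):]))
            rest = lam[p:D - (d - p)]
            s2 = rest.mean()                  # shared variance of discarded dims
            # With s2 set to the mean of the discarded eigenvalues, the
            # trace term of the Gaussian likelihood equals D for every
            # split, so ranking by the log-determinant part suffices.
            score = -0.5 * (np.log(kept).sum() + rest.size * np.log(s2))
            if best is None or score > best[0]:
                best = (score, p)
        return best                           # (score, #principal components)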

