Boltzmann Machine Learning with the Latent Maximum Entropy Principle
Wang, Shaojun, Schuurmans, Dale, Peng, Fuchun, Zhao, Yunxin
We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes' maximum entropy principle and from standard maximum likelihood estimation. We demonstrate the LME principle by deriving new algorithms for Boltzmann machine parameter estimation, and show how a robust and fast new variant of the EM algorithm can be developed. Our experiments show that estimation based on LME generally yields better results than maximum likelihood estimation, particularly when inferring hidden units from small amounts of data.
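For reference, a common statement of the LME principle from the authors' related work (our paraphrase, with x observed, y hidden, p-tilde the empirical distribution, and f_i the features): maximize the joint entropy subject to self-consistent constraints in which the model itself fills in the hidden part of each feature expectation.

```latex
\max_{p}\; H(p)
\qquad \text{s.t.} \qquad
\sum_{x,y} p(x,y)\, f_i(x,y) \;=\; \sum_{x} \tilde{p}(x) \sum_{y} p(y \mid x)\, f_i(x,y), \quad i = 1,\dots,m .
```

Because the constraints depend on p through p(y | x), the solution is a fixed point rather than a single convex program, which is where the EM-style iterative algorithms mentioned above come from.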
A New Algorithm for Maximum Likelihood Estimation in Gaussian Graphical Models for Marginal Independence
Drton, Mathias, Richardson, Thomas S.
Graphical models with bi-directed edges (<->) represent marginal independence: the absence of an edge between two vertices indicates that the corresponding variables are marginally independent. In this paper, we consider maximum likelihood estimation in the case of continuous variables with a Gaussian joint distribution, sometimes termed a covariance graph model. We present a new fitting algorithm which exploits standard regression techniques and establish its convergence properties. Moreover, we contrast our procedure to existing estimation methods.
Automated Analytic Asymptotic Evaluation of the Marginal Likelihood for Latent Models
We present and implement two algorithms for analytic asymptotic evaluation of the marginal likelihood of data given a Bayesian network with hidden nodes. As shown by previous work, this evaluation is particularly hard for latent Bayesian network models, namely networks that include hidden variables, where the asymptotic approximation deviates from the standard BIC score. Our algorithms solve two central difficulties in asymptotic evaluation of marginal likelihood integrals, namely, evaluation of the regular dimensionality drop for latent Bayesian network models and computation of non-standard approximation formulas for singular statistics for these models. The presented algorithms are implemented in Matlab and Maple and their usage is demonstrated for marginal likelihood approximations for Bayesian networks with hidden variables.
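For orientation, the regular large-sample expansion that the abstract contrasts against is the familiar BIC form (a textbook statement, not the paper's refined approximations):

```latex
\log P(D \mid M) \;\approx\; \log P(D \mid \hat{\theta}, M) \;-\; \frac{d}{2}\,\log N
```

where N is the sample size and d the model dimension. For latent Bayesian networks the effective dimension can be smaller than the raw parameter count (the "dimensionality drop"), and at singular parameter values the coefficient of log N, along with possible lower-order correction terms, must be derived case by case; automating those two computations is what the algorithms above address.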
A Distance-Based Branch and Bound Feature Selection Algorithm
Frank, Ari, Geiger, Dan, Yakhini, Zohar
There is no known efficient method for selecting the k Gaussian features out of n that achieve the lowest Bayesian classification error. We show an example of how greedy algorithms faced with this task are led to results that are not optimal. This motivates us to propose a more robust approach. We present a Branch and Bound algorithm for finding a subset of k independent Gaussian features which minimizes the naive Bayesian classification error. Our algorithm uses additive monotonic distance measures to produce bounds on the Bayesian classification error in order to exclude many feature subsets from evaluation, while still returning an optimal solution. We test our method on synthetic data as well as data obtained from gene expression profiling.
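As a rough illustration of the pruning idea only (not the authors' exact bounds), the sketch below runs a Narendra-Fukunaga style branch and bound over subsets of size k, where the criterion is additive over independent Gaussian features and monotone under feature removal. The per-feature Bhattacharyya-style score is an assumption chosen for the example.

```python
import numpy as np

def bhattacharyya_1d(mu0, var0, mu1, var1):
    """Per-feature Bhattacharyya distance between two univariate Gaussians
    (an illustrative additive, non-negative class-separability score)."""
    avg_var = 0.5 * (var0 + var1)
    return 0.125 * (mu0 - mu1) ** 2 / avg_var + 0.5 * np.log(avg_var / np.sqrt(var0 * var1))

def branch_and_bound_select(scores, k):
    """Find the size-k feature subset maximizing a sum of non-negative per-feature
    scores, pruning every subtree whose value already falls below the best size-k
    subset found (valid because removing features can only decrease the sum)."""
    n = len(scores)
    assert k <= n
    best = {"value": -np.inf, "subset": None}

    def recurse(subset, value, start):
        if value <= best["value"]:
            return                                  # bound: no descendant can do better
        if len(subset) == k:
            best["value"], best["subset"] = value, set(subset)
            return
        # remove one more feature, in increasing index order to avoid duplicates
        for i in range(start, n):
            if i in subset:
                recurse(subset - {i}, value - scores[i], i + 1)

    recurse(frozenset(range(n)), float(np.sum(scores)), 0)
    return best["subset"], best["value"]

# Usage sketch: scores[i] = bhattacharyya_1d(mu0[i], var0[i], mu1[i], var1[i]) per feature i.
```

With a purely additive score the optimum could of course be read off by sorting; the sketch is only meant to show the monotone-bound pruning pattern that the paper's error bounds plug into.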
Bayesian Hierarchical Mixtures of Experts
Bishop, Christopher M., Svensen, Markus
The Hierarchical Mixture of Experts (HME) is a well-known tree-based model for regression and classification, based on soft probabilistic splits. In its original formulation it was trained by maximum likelihood, and is therefore prone to over-fitting. Furthermore, the maximum likelihood framework offers no natural metric for optimizing the complexity and structure of the tree. Previous attempts to provide a Bayesian treatment of the HME model have relied either on ad-hoc local Gaussian approximations or have dealt with related models representing the joint distribution of both input and output variables. In this paper we describe a fully Bayesian treatment of the HME model based on variational inference. By combining local and global variational methods we obtain a rigorous lower bound on the marginal probability of the data under the model. This bound is optimized during the training phase, and its resulting value can be used for model order selection. We present results using this approach for a data set describing robot arm kinematics.
Markov Random Walk Representations with Continuous Distributions
Yeang, Chen-Hsiang, Szummer, Martin
Representations based on random walks can exploit discrete data distributions for clustering and classification. We extend such representations from discrete to continuous distributions. Transition probabilities are now calculated using a diffusion equation with a diffusion coefficient that inversely depends on the data density. We relate this diffusion equation to a path integral and derive the corresponding path probability measure. The framework is useful for incorporating continuous data densities and prior knowledge.
Collaborative Ensemble Learning: Combining Collaborative and Content-Based Information Filtering via Hierarchical Bayes
Yu, Kai, Schwaighofer, Anton, Tresp, Volker, Ma, Wei-Ying, Zhang, HongJiang
Collaborative filtering (CF) and content-based filtering (CBF) have widely been used in information filtering applications. Both approaches have their strengths and weaknesses, which is why researchers have developed hybrid systems. This paper proposes a novel approach to unify CF and CBF in a probabilistic framework, named collaborative ensemble learning. It uses probabilistic SVMs to model each user's profile (as CBF does). At the prediction phase, it combines a society of users' profiles, represented by their respective SVM models, to predict an active user's preferences (the CF idea). The combination scheme is embedded in a probabilistic framework and retains an intuitive explanation. Moreover, collaborative ensemble learning does not require a global training stage and thus can incrementally incorporate new data. We report results based on two data sets. For the Reuters-21578 text data set, we simulate user ratings under the assumption that each user is interested in only one category. In the second experiment, we use users' opinions on a set of 642 art images that were collected through a web-based survey. For both data sets, collaborative ensemble learning achieved excellent performance in terms of recommendation accuracy.
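A rough sketch of the combination idea described above, not the paper's exact hierarchical-Bayes formulas: each user's profile is a probabilistic SVM, and the active user's prediction mixes the profile models, weighting each by how well it explains the active user's few known ratings. The likelihood-based weighting rule below is an illustrative assumption, and it assumes every user has both positive and negative examples.

```python
import numpy as np
from sklearn.svm import SVC

def fit_profiles(user_data):
    """Fit one probabilistic SVM per user (the content-based part).
    user_data: {user_id: (X_items, y_likes)} with y_likes in {0, 1}."""
    return {u: SVC(probability=True).fit(X, y) for u, (X, y) in user_data.items()}

def predict_for_active_user(profiles, X_rated, y_rated, X_new):
    """Combine the society of profile models (the collaborative part).
    Each profile is weighted by the likelihood it assigns to the active user's
    known ratings -- an illustrative stand-in for the paper's combination scheme."""
    weights, preds = [], []
    for model in profiles.values():
        pos_col = list(model.classes_).index(1)
        p_like = model.predict_proba(X_rated)[:, pos_col]
        weights.append(np.prod(np.where(y_rated == 1, p_like, 1.0 - p_like)))
        preds.append(model.predict_proba(X_new)[:, pos_col])
    weights = np.asarray(weights)
    weights = weights / weights.sum() if weights.sum() > 0 else np.full(len(weights), 1.0 / len(weights))
    return weights @ np.vstack(preds)      # estimated P(like) for each new item
```

New data from the active user only changes the weights, which is one way to read the claim that no global retraining stage is needed.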
Efficient Parametric Projection Pursuit Density Estimation
Welling, Max, Zemel, Richard S., Hinton, Geoffrey E.
Product models of low-dimensional experts are a powerful way to avoid the curse of dimensionality. We present the "under-complete product of experts" (UPoE), where each expert models a one-dimensional projection of the data. The UPoE is fully tractable and may be interpreted as a parametric probabilistic model for projection pursuit. Its ML learning rules are identical to the approximate learning rules proposed previously for under-complete ICA. We also derive an efficient sequential learning algorithm and discuss its relationship to projection pursuit density estimation and feature induction algorithms for additive random field models.
Learning Module Networks
Segal, Eran, Pe'er, Dana, Regev, Aviv, Koller, Daphne, Friedman, Nir
Methods for learning Bayesian network structure can discover dependency structure between observed variables, and have been shown to be useful in many applications. However, in domains that involve a large number of variables, the space of possible network structures is enormous, making it difficult, for both computational and statistical reasons, to identify a good model. In this paper, we consider a solution to this problem, suitable for domains where many variables have similar behavior. Our method is based on a new class of models, which we call module networks. A module network explicitly represents the notion of a module - a set of variables that have the same parents in the network and share the same conditional probability distribution. We define the semantics of module networks, and describe an algorithm that learns a module network from data. The algorithm learns both the partitioning of the variables into modules and the dependency structure between the variables. We evaluate our algorithm on synthetic data, and on real data in the domains of gene expression and the stock market. Our results show that module networks generalize better than Bayesian networks, and that the learned module network structure reveals regularities that are obscured in learned Bayesian networks.
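A minimal sketch of the representation just described: variables grouped into a module that shares one parent set and one conditional distribution. The conditional-Gaussian parameterization and the names below are assumptions for illustration, not the authors' implementation; learning would then alternate between reassigning variables to modules and re-fitting each module's parents and parameters to improve a score.

```python
from dataclasses import dataclass
from typing import Dict, List
import numpy as np

@dataclass
class Module:
    """A module: a set of variables that share the same parents and the same CPD.
    Here the shared CPD is a linear Gaussian in the parents (an illustrative choice)."""
    variables: List[str]
    parents: List[str]
    weights: np.ndarray = None          # one coefficient per parent
    bias: float = 0.0
    noise_var: float = 1.0

    def log_likelihood(self, data: Dict[str, np.ndarray]) -> float:
        """Sum the shared conditional log-density over every variable in the module."""
        n = len(next(iter(data.values())))
        pa = np.column_stack([data[p] for p in self.parents]) if self.parents else np.zeros((n, 0))
        w = self.weights if self.weights is not None else np.zeros(pa.shape[1])
        mean = pa @ w + self.bias
        ll = 0.0
        for v in self.variables:
            resid = data[v] - mean
            ll += -0.5 * np.sum(np.log(2 * np.pi * self.noise_var) + resid ** 2 / self.noise_var)
        return ll
```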
Locally Weighted Naive Bayes
Frank, Eibe, Hall, Mark, Pfahringer, Bernhard
Despite its simplicity, the naive Bayes classifier has surprised machine learning researchers by exhibiting good performance on a variety of learning problems. Encouraged by these results, researchers have looked to overcome naive Bayes' primary weakness - the assumption of attribute independence - and improve the performance of the algorithm. This paper presents a locally weighted version of naive Bayes that relaxes the independence assumption by learning local models at prediction time. Experimental results show that locally weighted naive Bayes rarely degrades accuracy compared to standard naive Bayes and, in many cases, improves accuracy dramatically. The main advantage of this method compared to other techniques for enhancing naive Bayes is its conceptual and computational simplicity.
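A minimal sketch of the prediction-time idea: weight training instances by their distance to the query, then fit a weighted naive Bayes model just for that query. The Euclidean distance, linear kernel, nearest-neighbour bandwidth, and Gaussian class-conditional densities below are illustrative assumptions, not necessarily the paper's exact settings.

```python
import numpy as np

def lwnb_predict(X_train, y_train, x_query, k=50):
    """Locally weighted naive Bayes prediction for a single query point."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    bandwidth = np.sort(dists)[min(k, len(dists)) - 1] + 1e-12
    w = np.maximum(0.0, 1.0 - dists / bandwidth)           # linear weighting kernel

    classes = np.unique(y_train)
    log_post = []
    for c in classes:
        wc, Xc = w[y_train == c], X_train[y_train == c]
        if wc.sum() == 0:                                   # no weighted support for this class
            log_post.append(-np.inf)
            continue
        prior = (wc.sum() + 1.0) / (w.sum() + len(classes))  # smoothed weighted class prior
        mu = np.average(Xc, axis=0, weights=wc)
        var = np.average((Xc - mu) ** 2, axis=0, weights=wc) + 1e-6
        loglik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x_query - mu) ** 2 / var)
        log_post.append(np.log(prior) + loglik)
    return classes[int(np.argmax(log_post))]
```

The local model is discarded after each prediction, which is what keeps the scheme conceptually and computationally simple relative to other naive Bayes extensions.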