AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

Perspectives on Sparse Bayesian Learning

Palmer, Jason, Rao, Bhaskar D., Wipf, David P.

Neural Information Processing Systems, Dec-31-2004

Recently, relevance vector machines (RVM) have been fashioned from a sparse Bayesian learning (SBL) framework to perform supervised learning using a weight prior that encourages sparsity of representation. The methodology incorporates an additional set of hyperparameters governing the prior, one for each weight, and then adopts a specific approximation to the full marginalization over all weights and hyperparameters. Despite its empirical success, however, no rigorous motivation for this particular approximation is currently available. To address this issue, we demonstrate that SBL can be recast as the application of a rigorous variational approximation to the full model by expressing the prior in a dual form. This formulation obviates the necessity of assuming any hyperpriors and leads to natural, intuitive explanations of why sparsity is achieved in practice.

approximation, artificial intelligence, machine learning, (15 more...)

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.72)

Hierarchical Topic Models and the Nested Chinese Restaurant Process

Griffiths, Thomas L., Jordan, Michael I., Tenenbaum, Joshua B., Blei, David M.

Neural Information Processing Systems, Dec-31-2004

We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting: which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
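To make the prior in this abstract more concrete, here is a minimal sketch of drawing tree paths from a nested Chinese restaurant process. It assumes the standard CRP seating rule: an existing branch is chosen with probability proportional to how many earlier items took it, and a new branch with probability proportional to a concentration parameter. The names `crp_choice`, `sample_ncrp_paths`, `gamma`, and `depth` are hypothetical and are not taken from the paper, and the sketch covers only the prior over paths, not the topic-model likelihood.

```python
import random


def crp_choice(counts, gamma, rng):
    """Return an index into counts, or len(counts) to open a new branch.

    Existing branch i is chosen with probability counts[i] / (n + gamma),
    a new branch with probability gamma / (n + gamma), where n = sum(counts).
    """
    r = rng.random() * (sum(counts) + gamma)
    acc = 0.0
    for i, c in enumerate(counts):
        acc += c
        if r < acc:
            return i
    return len(counts)


def sample_ncrp_paths(n_docs, depth, gamma, seed=0):
    """Draw one root-to-leaf path per document (assumed fixed tree depth).

    A node is identified by the tuple of branch indices leading to it;
    children[node] holds the occupancy count of each child of that node.
    """
    rng = random.Random(seed)
    children = {}
    paths = []
    for _ in range(n_docs):
        node = ()
        for _ in range(depth):
            counts = children.setdefault(node, [])
            idx = crp_choice(counts, gamma, rng)
            if idx == len(counts):
                counts.append(0)  # open a new branch at this node
            counts[idx] += 1
            node = node + (idx,)
        paths.append(node)
    return paths


if __name__ == "__main__":
    for path in sample_ncrp_paths(n_docs=8, depth=3, gamma=1.0, seed=1):
        print(path)
```

Because a new branch can be opened at every node, the branching factor is unbounded, while popular branches attract further documents. Larger values of the assumed `gamma` tend to produce bushier trees.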

artificial intelligence, machine learning, natural language, (19 more...)

Country:

North America > United States > California (0.46)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Industry: Consumer Products & Services > Restaurants (0.90)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Explicit Learning Curves for Transduction and Application to Clustering and Compression Algorithms

Derbeko, P., El-Yaniv, R., Meir, R.

Journal of Artificial Intelligence Research, Oct-1-2004

Inductive learning is based on inferring a general rule from a finite data set and using it to label new data. In transduction one attempts to solve the problem of using a labeled training set to label a set of unlabeled points, which are given to the learner prior to learning. Although transduction seems at the outset to be an easier task than induction, there have not been many provably useful algorithms for transduction. Moreover, the precise relation between induction and transduction has not yet been determined. The main theoretical developments related to transduction were presented by Vapnik more than twenty years ago. One of Vapnik's basic results is a rather tight error bound for transductive classification based on an exact computation of the hypergeometric tail. While tight, this bound is given implicitly via a computational routine. Our first contribution is a somewhat looser but explicit characterization of a slightly extended PAC-Bayesian version of Vapnik's transductive bound. This characterization is obtained using concentration inequalities for the tail of sums of random variables obtained by sampling without replacement. We then derive error bounds for compression schemes such as (transductive) support vector machines and for transduction algorithms based on clustering. The main observation used for deriving these new error bounds and algorithms is that the unlabeled test points, which in the transductive setting are known in advance, can be used in order to construct useful data dependent prior distributions over the hypothesis space.
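The hypergeometric tail mentioned in this abstract can be illustrated with a short computation. The sketch below is not Vapnik's bound and not the explicit bound derived in the paper. It only evaluates the exact tail probability for sampling without replacement and compares it with a generic Hoeffding-style bound, which also holds for sampling without replacement. The function names and the example numbers are hypothetical.

```python
import math


def hypergeom_tail(population, successes, draws, k):
    """Exact P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    X counts the successes among `draws` items sampled without replacement
    from a population containing `successes` success items.
    """
    lo = max(k, 0)
    hi = min(draws, successes)
    total = sum(
        math.comb(successes, j) * math.comb(population - successes, draws - j)
        for j in range(lo, hi + 1)
    )
    return total / math.comb(population, draws)


def hoeffding_tail_bound(population, successes, draws, k):
    """Generic explicit bound exp(-2 * n * t^2) on P(X >= k).

    Here t = k / draws - successes / population; the bound is trivial (1.0)
    when t <= 0.
    """
    t = k / draws - successes / population
    if t <= 0:
        return 1.0
    return math.exp(-2.0 * draws * t * t)


if __name__ == "__main__":
    # Hypothetical numbers: 100 points, 30 of them errors, 20 drawn.
    exact = hypergeom_tail(100, 30, 20, 10)
    bound = hoeffding_tail_bound(100, 30, 20, 10)
    print(f"exact tail = {exact:.6f}, explicit bound = {bound:.6f}")
```

The exact tail requires summing binomial coefficients, whereas the explicit bound is a closed-form expression. That is the general trade-off between tightness and explicitness that the abstract refers to.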

algorithm, transduction, vapnik, (16 more...)

doi: 10.1613/jair.1417

AI Access Foundation

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Ordinal and Probabilistic Representations of Acceptance

Dubois, D., Fargier, H., Prade, H.

Journal of Artificial Intelligence Research, Jul-1-2004

An accepted belief is a proposition considered likely enough by an agent to be inferred from as if it were true. This paper bridges the gap between probabilistic and logical representations of accepted beliefs. To this end, natural properties of relations on propositions, describing relative strength of belief, are augmented with some conditions ensuring that accepted beliefs form a deductively closed set. This requirement turns out to be very restrictive. In particular, it is shown that the sets of accepted beliefs of an agent can always be derived from a family of possibility rankings of states. An agent accepts a proposition in a given context if this proposition is considered more possible than its negation in this context, for all possibility rankings in the family. These results are closely connected to the non-monotonic 'preferential' inference system of Kraus, Lehmann and Magidor and the so-called plausibility functions of Friedman and Halpern. The extent to which probability theory is compatible with acceptance relations is laid bare. A solution to the lottery paradox, which is considered a major impediment to the use of non-monotonic inference, is proposed using a special kind of probabilities (called lexicographic, or big-stepped). The setting of acceptance relations also proposes another way of approaching the theory of belief change after the works of Gärdenfors and colleagues. Our view considers the acceptance relation as a primitive object from which belief sets are derived in various contexts.

acceptance relation, confidence relation, relation, (15 more...)

doi: 10.1613/jair.1265

AI Access Foundation

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Netherlands > South Holland > Dordrecht (0.04)
(15 more...)

Industry:

Leisure & Entertainment (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)
(2 more...)

Using Machine Learning to Design and Interpret Gene-Expression Microarrays

Molla, Michael, Waddell, Michael, Page, David, Shavlik, Jude

AI Magazine, Mar-15-2004

Gene-expression microarrays, commonly called gene chips, make it possible to simultaneously measure the rate at which a cell or tissue is expressing -- translating into a protein -- each of its thousands of genes. One can use these comprehensive snapshots of biological activity to infer regulatory pathways in cells; identify novel targets for drug design; and improve the diagnosis, prognosis, and treatment planning for those suffering from disease. However, the amount of data this new technology produces is more than one can manually analyze. Hence, the need for automated analysis of microarray data offers an opportunity for machine learning to have a significant impact on biology and medicine. This article describes microarray technology, the data it produces, and the types of machine learning tasks that naturally arise with these data. It also reviews some of the recent prominent applications of machine learning to gene-chip data, points to related tasks where machine learning might have a further impact on biology and medicine, and describes additional types of interesting data that recent advances in biotechnology allow biomedical researchers to collect.

bioinformatics, experiment, machine learning, (18 more...)

Country: North America > United States > California (0.28)

Genre: Overview (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
(2 more...)

Distribution of Mutual Information from Complete and Incomplete Data

Hutter, Marcus, Zaffalon, Marco

arXiv.org Artificial Intelligence, Mar-15-2004

Mutual information is widely used, in a descriptive way, to measure the stochastic dependence of categorical random variables. In order to address questions such as the reliability of the descriptive value, one must consider sample-to-population inferential approaches. This paper deals with the posterior distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean, and analytical approximations for the variance, skewness and kurtosis are derived. These approximations have a guaranteed accuracy level of the order O(1/n^3), where n is the sample size. Leading order approximations for the mean and the variance are derived in the case of incomplete samples. The derived analytical expressions allow the distribution of mutual information to be approximated reliably and quickly. In fact, the derived expressions can be computed with the same order of complexity needed for descriptive mutual information. This makes the distribution of mutual information become a concrete alternative to descriptive mutual information in many applications which would benefit from moving to the inductive side. Some of these prospective applications are discussed, and one of them, namely feature selection, is shown to perform significantly better when inductive mutual information is used.
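The difference between descriptive mutual information and its posterior distribution can be sketched numerically. The paper derives analytical expressions, and those are not reproduced here. As a stand-in, the sketch below uses plain Monte Carlo: it samples joint distributions from a Dirichlet posterior over the cells of a contingency table and computes the mutual information of each sample. The symmetric prior `alpha=1.0`, the function names, and the example counts are all assumptions made for illustration.

```python
import math
import random


def normalize(table):
    """Scale a table of non-negative numbers so that its entries sum to 1."""
    total = sum(sum(row) for row in table)
    return [[x / total for x in row] for row in table]


def mutual_information(joint):
    """Mutual information (in nats) of a joint probability table."""
    row_marg = [sum(row) for row in joint]
    col_marg = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log(p / (row_marg[i] * col_marg[j]))
    return mi


def posterior_mi_samples(counts, alpha=1.0, n_samples=2000, seed=0):
    """Monte Carlo draws of mutual information under a Dirichlet posterior.

    A Dirichlet(counts + alpha) draw is obtained by normalizing independent
    Gamma(counts[i][j] + alpha, 1) variates.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        gammas = [[rng.gammavariate(c + alpha, 1.0) for c in row] for row in counts]
        samples.append(mutual_information(normalize(gammas)))
    return samples


if __name__ == "__main__":
    counts = [[12, 3], [4, 11]]  # hypothetical 2x2 contingency table
    descriptive = mutual_information(normalize(counts))
    draws = posterior_mi_samples(counts)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
    print(f"descriptive MI = {descriptive:.4f}")
    print(f"posterior mean = {mean:.4f}, posterior variance = {var:.6f}")
```

The descriptive value is a single point estimate, while the posterior draws show how much that value could vary given the sample size. The analytical expressions described in the abstract aim to provide the same kind of summary without sampling.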

artificial intelligence, machine learning, mutual information, (15 more...)

cs/0403025

Country: North America > United States (0.93)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Representation Dependence in Probabilistic Inference

Halpern, J. Y., Koller, D.

Journal of Artificial Intelligence Research, Mar-1-2004

Non-deductive reasoning systems are often representation dependent: representing the same situation in two different ways may cause such a system to return two different answers. Some have viewed this as a significant problem. For example, the principle of maximum entropy has been subjected to much criticism due to its representation dependence. There has, however, been almost no work investigating representation dependence. In this paper, we formalize this notion and show that it is not a problem specific to maximum entropy. In fact, we show that any representation-independent probabilistic inference procedure that ignores irrelevant information is essentially entailment, in a precise sense. Moreover, we show that representation independence is incompatible with even a weak default assumption of independence. We then show that invariance under a restricted class of representation changes can form a reasonable compromise between representation independence and other desiderata, and provide a construction of a family of inference procedures that provides such restricted representation independence, using relative entropy.

constraint, inference procedure, representation independence, (13 more...)

doi: 10.1613/jair.1292

AI Access Foundation

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Competitive Coevolution through Evolutionary Complexification

Stanley, K. O., Miikkulainen, R.

Journal of Artificial Intelligence Research, Feb-1-2004

Two major goals in machine learning are the discovery and improvement of solutions to complex problems. In this paper, we argue that complexification, i.e. the incremental elaboration of solutions through adding new structure, achieves both these goals. We demonstrate the power of complexification through the NeuroEvolution of Augmenting Topologies (NEAT) method, which evolves increasingly complex neural network architectures. NEAT is applied to an open-ended coevolutionary robot duel domain where robot controllers compete head to head. Because the robot duel domain supports a wide range of strategies, and because coevolution benefits from an escalating arms race, it serves as a suitable testbed for studying complexification. When compared to the evolution of networks with fixed structure, complexifying evolution discovers significantly more sophisticated strategies. The results suggest that in order to discover and improve complex solutions, evolution, and search in general, should be allowed to complexify as well as optimize.

coevolution, complexification, dominant strategy, (14 more...)

doi: 10.1613/jair.1338

AI Access Foundation

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(8 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
(3 more...)

Effective Dimensions of Hierarchical Latent Class Models

Zhang, N. L., Kocka, T.

Journal of Artificial Intelligence Research, Jan-1-2004

Hierarchical latent class (HLC) models are tree-structured Bayesian networks where leaf nodes are observed while internal nodes are latent. There are no theoretically well justified model selection criteria for HLC models in particular and Bayesian networks with latent nodes in general. Nonetheless, empirical studies suggest that the BIC score is a reasonable criterion to use in practice for learning HLC models. Empirical studies also suggest that sometimes model selection can be improved if standard model dimension is replaced with effective model dimension in the penalty term of the BIC score. Effective dimensions are difficult to compute. In this paper, we prove a theorem that relates the effective dimension of an HLC model to the effective dimensions of a number of latent class models. The theorem makes it computationally feasible to compute the effective dimensions of large HLC models. The theorem can also be used to compute the effective dimensions of general tree models.
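The role of model dimension in the BIC penalty can be sketched as follows. The code assumes the common "larger is better" form of the score, the log-likelihood minus half the dimension times the log of the sample size. It computes the standard dimension of a single latent class model, together with a simple upper bound on its effective dimension: the effective dimension can exceed neither the standard dimension nor the number of free parameters in the joint distribution of the observed variables. This bound is not the theorem proved in the paper, and computing the effective dimension exactly is not shown. All function names and numbers are hypothetical.

```python
import math


def lc_standard_dimension(latent_card, observed_cards):
    """Free parameters of a latent class model.

    (latent_card - 1) for the latent prior, plus (c - 1) conditional
    parameters per latent state for each observed variable of cardinality c.
    """
    return (latent_card - 1) + latent_card * sum(c - 1 for c in observed_cards)


def effective_dimension_upper_bound(latent_card, observed_cards):
    """Simple upper bound on the effective dimension (an assumption-level sketch)."""
    joint_free = math.prod(observed_cards) - 1
    return min(lc_standard_dimension(latent_card, observed_cards), joint_free)


def bic_score(log_likelihood, dimension, n_samples):
    """BIC in 'larger is better' form: loglik - (dimension / 2) * log(N)."""
    return log_likelihood - 0.5 * dimension * math.log(n_samples)


if __name__ == "__main__":
    latent_card, observed_cards = 3, [2, 2, 2]
    standard = lc_standard_dimension(latent_card, observed_cards)
    bound = effective_dimension_upper_bound(latent_card, observed_cards)
    loglik, n = -512.0, 200  # hypothetical fitted log-likelihood and sample size
    print("standard dimension:", standard)
    print("effective dimension is at most:", bound)
    print("BIC with standard dimension:", bic_score(loglik, standard, n))
    print("BIC with bounded dimension: ", bic_score(loglik, bound, n))
```

In this hypothetical example the standard dimension overstates the number of distinguishable parameters, so replacing it with a smaller effective dimension reduces the penalty and can change which model is selected.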

dimension, effective dimension, hlc model, (15 more...)

doi: 10.1613/jair.1311

AI Access Foundation

Country:

Asia > China > Hong Kong (0.04)
Europe > Czechia > Prague (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)

Combining Dimensions and Features in Similarity-Based Representations

Navarro, Daniel J., Lee, Michael D.

Neural Information Processing Systems, Dec-31-2003

The dimensional and featural approaches have different strengths and weaknesses. Dimensional representations are constrained by the metric axioms, such as the triangle inequality, that are violated by some empirical data.

artificial intelligence, machine learning, representation, (17 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Ohio (0.04)
North America > United States > New York (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)
Information Technology > Artificial Intelligence > Cognitive Science (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
