AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Choice of neighbor order in nearest-neighbor classification

Hall, Peter, Park, Byeong U., Samworth, Richard J.

arXiv.org Machine LearningOct-29-2008

The $k$th-nearest neighbor rule is arguably the simplest and most intuitively appealing nonparametric classification procedure. However, application of this method is inhibited by lack of knowledge about its properties, in particular, about the manner in which it is influenced by the value of $k$; and by the absence of techniques for empirical choice of $k$. In the present paper we detail the way in which the value of $k$ determines the misclassification error. We consider two models, Poisson and Binomial, for the training samples. Under the first model, data are recorded in a Poisson stream and are "assigned" to one or other of the two populations in accordance with the prior probabilities. In particular, the total number of data in both training samples is a Poisson-distributed random variable. Under the Binomial model, however, the total number of data in the training samples is fixed, although again each data value is assigned in a random way. Although the values of risk and regret associated with the Poisson and Binomial models are different, they are asymptotically equivalent to first order, and also to the risks associated with kernel-based classifiers that are tailored to the case of two derivatives. These properties motivate new methods for choosing the value of $k$.

artificial intelligence, classifier, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1214/07-AOS537

0810.5276

Country:

North America > United States (0.46)
Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.65)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.31)

Add feedback

Statistical Learning Theory: Models, Concepts, and Results

von Luxburg, Ulrike, Schoelkopf, Bernhard

arXiv.org Machine LearningOct-27-2008

Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms and is arguably one of the most beautifully developed branches of artificial intelligence in general. It originated in Russia in the 1960s and gained wide popularity in the 1990s following the development of the so-called Support Vector Machine (SVM), which has become a standard tool for pattern recognition in a variety of domains ranging from computer vision to computational biology. Providing the basis of new learning algorithms, however, was not the only motivation for developing statistical learning theory. It was just as much a philosophical one, attempting to answer the question of what it is that allows us to draw valid conclusions from empirical data. In this article we attempt to give a gentle, nontechnical overview over the key ideas and insights of statistical learning theory. We do not assume that the reader has a deep background in mathematics, statistics, or computer science. Given the nature of the subject matter, however, some familiarity with mathematical concepts and notations and some intuitive understanding of basic probability is required. There exist many excellent references to more technical surveys of the mathematics of statistical learning theory: the monographs by one of the founders of statistical learning theory (Vapnik, 1995, Vapnik, 1998), a brief overview over statistical learning theory in Section 5 of Schölkopf and Smola (2002), more technical overview papers such as Bousquet et al. (2003), Mendelson (2003), Boucheron et al. (2005), Herbrich and Williamson (2002), and the monograph Devroye et al. (1996).

artificial intelligence, classifier, machine learning, (19 more...)

arXiv.org Machine Learning

0810.4752

Country:

Europe (0.87)
North America > United States > Massachusetts (0.28)

Genre:

Research Report (0.63)
Overview (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

On Similarities between Inference in Game Theory and Machine Learning

Rezek, I., Leslie, D. S., Reece, S., Roberts, S. J., Rogers, A., Dash, R. K., Jennings, N. R.

Journal of Artificial Intelligence ResearchOct-23-2008

In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both ﬁelds, and as proof of the usefulness of this approach, we use recent developments in each ﬁeld to make useful improvements to the other. More speciﬁcally, we consider the analogies between smooth best responses in ﬁctitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard ﬁctitious play) we derive a novel moderated ﬁctitious play algorithm and show that it is more likely than standard ﬁctitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore we consider the converse case, and show how insights from game theory can be used to derive two improved mean ﬁeld variational learning algorithms. We ﬁrst show that the standard update rule of mean ﬁeld variational learning is analogous to a Cournot adjustment within game theory. By analogy with ﬁctitious play, we then suggest an improved update rule, and show that this results in ﬁctitious variational play, an improved mean ﬁeld variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in ﬁctitious play, namely dynamic ﬁctitious play, to derive a derivative action variational learning algorithm, that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution).

algorithm, fictitious play, variational algorithm, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2523

AI Access Foundation

10574

Journal of Artificial Intelligence Research

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

The many faces of optimism - Extended version

Szita, István, Lőrincz, András

arXiv.org Artificial IntelligenceOct-19-2008

The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we integrate several concepts and obtain a fast and simple algorithm. We show that the proposed algorithm finds a near-optimal policy in polynomial time, and give experimental evidence that it is robust and efficient compared to its ascendants.

algorithm, artificial intelligence, upstream oil & gas, (19 more...)

arXiv.org Artificial Intelligence

0810.3451

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

A Rigorously Bayesian Beam Model and an Adaptive Full Scan Model for Range Finders in Dynamic Environments

De Laet, T., De Schutter, J., Bruyninckx, H.

Journal of Artificial Intelligence ResearchOct-16-2008

This paper proposes and experimentally validates a Bayesian network model of a range finder adapted to dynamic environments. All modeling assumptions are rigorously explained, and all model parameters have a physical interpretation. This approach results in a transparent and intuitive model. With respect to the state of the art beam model this paper: (i) proposes a different functional form for the probability of range measurements caused by unmodeled objects, (ii) intuitively explains the discontinuity encountered in te state of the art beam model, and (iii) reduces the number of model parameters, while maintaining the same representational power for experimental data. The proposed beam model is called RBBM, short for Rigorously Bayesian Beam Model. A maximum likelihood and a variational Bayesian estimator (both based on expectation-maximization) are proposed to learn the model parameters. Furthermore, the RBBM is extended to a full scan model in two steps: first, to a full scan model for static environments and next, to a full scan model for general, dynamic environments. The full scan model accounts for the dependency between beams and adapts to the local sample density when using a particle filter. In contrast to Gaussian-based state of the art models, the proposed full scan model uses a sample-based approximation. This sample-based approximation enables handling dynamic environments and capturing multi-modality, which occurs even in simple static environments.

approximation, probability, rigorously bayesian beam model, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2540

AI Access Foundation

10572

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Determining the Unithood of Word Sequences using a Probabilistic Approach

Wong, Wilson, Liu, Wei, Bennamoun, Mohammed

arXiv.org Artificial IntelligenceOct-1-2008

Most research related to unithood were conducted as part of a larger effort for the determination of termhood. Consequently, novelties are rare in this small sub-field of term extraction. In addition, existing work were mostly empirically motivated and derived. We propose a new probabilistically-derived measure, independent of any influences of termhood, that provides dedicated measures to gather linguistic evidence from parsed text and statistical evidence from Google search engine for the measurement of unithood. Our comparative study using 1,825 test cases against an existing empirically-derived function revealed an improvement in terms of precision, recall and accuracy.

information retrieval, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

0810.0139

Country:

Europe (0.68)
Asia (0.68)
Oceania > Australia (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Networks of Influence Diagrams: A Formalism for Representing Agents' Beliefs and Decision-Making Processes

Gal, Y., Pfeffer, A.

Journal of Artificial Intelligence ResearchSep-26-2008

This paper presents Networks of Influence Diagrams (NID), a compact, natural and highly expressive language for reasoning about agents' beliefs and decision-making processes. NIDs are graphical structures in which agents' mental models are represented as nodes in a network; a mental model for an agent may itself use descriptions of the mental models of other agents. NIDs are demonstrated by examples, showing how they can be used to describe conflicting and cyclic belief structures, and certain forms of bounded rationality. In an opponent modeling domain, NIDs were able to outperform other computational agents whose strategies were not known in advance. NIDs are equivalent in representation to Bayesian games but they are more compact and structured than this formalism. In particular, the equilibrium definition for NIDs makes an explicit distinction between agents' optimal strategies, and how they actually behave in reality.

agent, bob, top-level block, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2503

AI Access Foundation

10570

Journal of Artificial Intelligence Research

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry:

Leisure & Entertainment > Sports > Baseball (0.50)
Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Anytime Induction of Low-cost, Low-error Classifiers: a Sampling-based Approach

Esmeir, S., Markovitch, S.

Journal of Artificial Intelligence ResearchSep-17-2008

Machine learning techniques are gaining prevalence in the production of a wide range of classifiers for complex real-world applications with nonuniform testing and misclassification costs. The increasing complexity of these applications poses a real challenge to resource management during learning and classification. In this work we introduce ACT (anytime cost-sensitive tree learner), a novel framework for operating in such complex environments. ACT is an anytime algorithm that allows learning time to be increased in return for lower classification costs. It builds a tree top-down and exploits additional time resources to obtain better estimations for the utility of the different candidate splits. Using sampling techniques, ACT approximates the cost of the subtree under each candidate split and favors the one with a minimal cost. As a stochastic algorithm, ACT is expected to be able to escape local minima, into which greedy methods may be trapped. Experiments with a variety of datasets were conducted to compare ACT to the state-of-the-art cost-sensitive tree learners. The results show that for the majority of domains ACT produces significantly less costly trees. ACT also exhibits good anytime behavior with diminishing returns.

algorithm, dataset, misclassification cost, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2602

AI Access Foundation

10567

Journal of Artificial Intelligence Research

Country:

North America > United States > Washington > King County > Seattle (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(19 more...)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Finding links and initiators: a graph reconstruction problem

Mannila, Heikki, Terzi, Evimaria

arXiv.org Artificial IntelligenceSep-17-2008

Analyzing 0-1 matrices is one of the main themes in data mining. Techniques such as clustering or mixture modelling, matrix decomposition techniques such as PCA, ICA, and NMR, and Bayesian all aim to give an answer to the informal question: "Where does the matrix come from?" These approaches aim at describing a probabilistic generative model that describes the observed matrix well. In this paper we consider yet another way of answering the question "Where does a 0-1 matrix M come from?" In our model, the matrix M of size n m is considered to arise from initiators, certain few entries that are initially 1. The initiators propagate their 1's by following the links of a directed influence graph G (represented by an n n adjacency matrix). We denote the initiator matrix of size n m by N and we use G (of size n n) to refer both to the directed graph between the rows of M and as well as its adjacency matrix. Then, we believe that the structure of N and G can tell how a matrix M has been created.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

0809.3027

Country: North America > United States (0.47)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Data Science > Data Mining (0.66)

Add feedback

Latent Tree Models and Approximate Inference in Bayesian Networks

Wang, Y., Zhang, N. L., Chen, T.

Journal of Artificial Intelligence ResearchAug-26-2008

We propose a novel method for approximate inference in Bayesian networks (BNs). The idea is to sample data from a BN, learn a latent tree model (LTM) from the data offline, and when online, make inference with the LTM instead of the original BN. Because LTMs are tree-structured, inference takes linear time. In the meantime, they can represent complex relationship among leaf nodes and hence the approximation accuracy is often good. Empirical evidence shows that our method can achieve good approximation accuracy at low online computational cost.

cardinality, inferential complexity, latent variable, (8 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2530

AI Access Foundation

10564

Journal of Artificial Intelligence Research

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback