AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Hierarchical Bayesian Models for Latent Attribute Detection in Social Media

Rao, Delip (Johns Hopkins University) | Paul, Michael (Johns Hopkins University) | Fink, Clay (Johns Hopkins University) | Yarowsky, David (Johns Hopkins University) | Oates, Timothy (University of Maryland Baltimore County) | Coppersmith, Glen (JHU Human Language Technology Center of Excellence)

AAAI ConferencesJul-12-2011

We present several novel minimally-supervised models for detecting latent attributes of social media users, with a focus on ethnicity and gender. Previouswork on ethnicity detection has used coarse-grained widely separated classes of ethnicity and assumed the existence of large amounts of training data such as the US census, simplifying the problem. Instead, we examine content generated by users in addition to name morpho-phonemics to detect ethnicity and gender. Further, weaddress this problem in a challenging setting where the ethnicity classes are more fine grained -- ethnicity classes in Nigeria -- and with very limited training data.

artificial intelligence, machine learning, social media, (12 more...)

AAAI Conferences

Fifth International AAAI Conference on Weblogs and Social Media

Country:

Africa > Nigeria (0.25)
North America > United States > Maryland > Baltimore (0.14)
North America > United States > Maryland > Baltimore County (0.14)
(3 more...)

Industry: Information Technology (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Add feedback

Latent Set Models for Two-Mode Network Data

DuBois, Christopher (University of California, Irvine) | Foulds, James (University of California, Irvine) | Smyth, Padhraic (University of California, Irvine)

AAAI ConferencesJul-12-2011

Two-mode networks are a natural representation for many kinds of relational data. These networks are bipartite graphs consisting of two distinct sets ("modes") of entities. For example, one can model multiple recipient email data as a two-mode network of (a) individuals and (b) the emails that they send or receive. In this work we present a statistical model for two-mode network data which posits that individuals belong to latent sets and that the members of a particular set tend to co-appear. We show how to infer these latent sets from observed data using a Markov chain Monte Carlo inference algorithm. We apply the model to the Enron email corpus, using it to discover interpretable latent structure as well as evaluating its predictive accuracy on a missing data task. Extensions to the model are discussed that incorporate additional side information such as the email's sender or text content, further improving the accuracy of the model.

artificial intelligence, bayesian inference, machine learning, (20 more...)

AAAI Conferences

Fifth International AAAI Conference on Weblogs and Social Media

Country:

Asia > Middle East > Jordan (0.05)
Europe > France (0.05)
North America > United States > California > Orange County > Irvine (0.04)
(2 more...)

Industry:

Telecommunications > Networks (0.61)
Information Technology > Networks (0.61)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Finding Consensus Bayesian Network Structures

Peña, Jose M.

arXiv.org Machine LearningJul-11-2011

Suppose that multiple experts (or learning algorithms) provide us with alternative Bayesian network (BN) structures over a domain, and that we are interested in combining them into a single consensus BN structure. Specifically, we are interested in that the consensus BN structure only represents independences all the given BN structures agree upon and that it has as few parameters associated as possible. In this paper, we prove that there may exist several nonequivalent consensus BN structures and that finding one of them is NPhard. Thus, we decide to resort to heuristics to find an approximated consensus BN structure. In this paper, we consider the heuristic proposed in (Matzkevich and Abramson, 1992, 1993a,b). This heuristic builds upon two algorithms, called Methods A and B, for efficiently deriving the minimal directed independence map of a BN structure relative to a given node ordering. Methods A and B are claimed to be correct although no proof is provided (a proof is just sketched). In this paper, we show that Methods A and B are not correct and propose a correction of them.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Machine Learning

1101.1715

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

AI-Based Software Defect Predictors: Applications and Benefits in a Case Study

Misirli, Ayse Tosun (Bogazici University) | Bener, Ayse (Ryerson University) | Kale, Resat (Turkcell Technology)

AI MagazineJul-9-2011

Software defect prediction aims to reduce software testing efforts by guiding testers through the defect-prone sections of software systems. Defect predictors are widely used in organizations to predict defects in order to save time and effort as an alternative to other techniques such as manual code reviews. The usage of a defect prediction model in a real-life setting is difficult because it requires software metrics and defect data from past projects to predict the defect-proneness of new projects. It is, on the other hand, very practical because it is easy to apply, can detect defects using less time and reduces the testing effort. We have built a learning-based defect prediction model for a telecommunication company in the space of one year. In this study, we have briefly explained our model, presented its pay-off and described how we have implemented the model in the company. Furthermore, we compared the performance of our model with that of another testing strategy applied in a pilot project that implemented a new process called Team Software Process (TSP). Our results show that defect predictors can predict 87 percent of code defects, decrease inspection efforts by 72 percent and hence, reduces post-release defects by 44 percent. Furthermore, they can be used as complementary tools for a new process implementation whose effects on testing activities are limited.

artificial intelligence, bayesian inference, machine learning, (16 more...)

AI Magazine

Country:

North America > United States (1.00)
Europe (0.68)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Networks (0.48)
Telecommunications > Networks (0.34)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Add feedback

Loss-sensitive Training of Probabilistic Conditional Random Fields

Volkovs, Maksims N., Larochelle, Hugo, Zemel, Richard S.

arXiv.org Machine LearningJul-9-2011

We consider the problem of training probabilistic conditional random fields (CRFs) in the context of a task where performance is measured using a specific loss function. While maximum likelihood is the most common approach to training CRFs, it ignores the inherent structure of the task's loss function. We describe alternatives to maximum likelihood which take that loss into account. These include a novel adaptation of a loss upper bound from the structured SVMs literature to the CRF context, as well as a new loss-inspired KL divergence objective which relies on the probabilistic nature of CRFs. These loss-sensitive objectives are compared to maximum likelihood using ranking as a benchmark task. This comparison confirms the importance of incorporating loss information in the probabilistic training of CRFs, with the loss-inspired KL outperforming all other objectives.

artificial intelligence, machine learning, objective, (18 more...)

arXiv.org Machine Learning

1107.1805

Country:

North America > United States (0.93)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.77)

Add feedback

Predicting Solution Cost with Conditional Probabilities

Lelis, Levi (University of Alberta) | Stern, Roni (Ben Gurion University) | Arfaee, Shahab Jabbari (University of Alberta)

AAAI ConferencesJul-5-2011

Classical heuristic search algorithms find the solution cost of a problem while finding the path from the start state to a goal state. However, there are applications in which finding the path is not needed. In this paper we propose an algorithm that accurately and efficiently predicts the solution cost of a problem without finding the actual solution. We show empirically that our predictor makes more accurate predictions when compared to the bootstrapped heuristic, which is known to be a very accurate inadmissible heuristic. In addition, we show how our prediction algorithm can be used to enhance heuristic search algorithms. Namely, we use our predictor to calculate a bound for a bounded best-first search algorithm and to tune the w-value of Weighted IDA*. In both cases major search speedups were observed.

algorithm, prediction, solution cost, (17 more...)

AAAI Conferences

Fourth Annual Symposium on Combinatorial Search

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Add feedback

A Variational Bayes Approach to Decoding in a Phase-Uncertain Digital Receiver

Das, Arijit, Quinn, Anthony

arXiv.org Machine LearningJul-4-2011

This paper presents a Bayesian approach to symbol and phase inference in a phase-unsynchronized digital receiver. It primarily extends [Quinn 2011] to the multi-symbol case, using the variational Bayes (VB) approximation to deal with the combinatorial complexity of the phase inference in this case. The work provides a fully Bayesian extension of the EM-based framework underlying current turbo-synchronization methods, since it induces a von Mises prior on the time-invariant phase parmeter. As a result, we achieve tractable iterative algorithms with improved robustness in low SNR regimes, compared to the current EM-based approaches. As a corollary to our analysis we also discover the importance of prior regularization in elegantly tackling the significant problem of phase ambiguity.

artificial intelligence, machine learning, posterior distribution, (15 more...)

arXiv.org Machine Learning

1107.0662

Country: Europe > Ireland > Leinster > County Dublin > Dublin (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Law of Connectivity in Machine Learning

Dundas, Jitesh

arXiv.org Artificial IntelligenceJul-1-2011

We present in this paper our law that there is always a connection present between two entities, with a selfconnection being present at least in each node. An entity is an object, physical or imaginary, that is connected by a path (or connection) and which is important for achieving the desired result of the scenario. In machine learning, we state that for any scenario, a subject entity is always, directly or indirectly, connected and affected by single or multiple independent / dependent entities, and their impact on the subject entity is dependent on various factors falling into the categories such as the existenc

artificial intelligence, interaction, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1107.0194

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

Sequential Diagnosis by Abstraction

Siddiqi, S. A., Huang, J.

Journal of Artificial Intelligence ResearchJun-30-2011

When a system behaves abnormally, sequential diagnosis takes a sequence of measurements of the system until the faults causing the abnormality are identified, and the goal is to reduce the diagnostic cost, defined here as the number of measurements. To propose measurement points, previous work employs a heuristic based on reducing the entropy over a computed set of diagnoses. This approach generally has good performance in terms of diagnostic cost, but can fail to diagnose large systems when the set of diagnoses is too large. Focusing on a smaller set of probable diagnoses scales the approach but generally leads to increased average diagnostic costs. In this paper, we propose a new diagnostic framework employing four new techniques, which scales to much larger systems with good performance in terms of diagnostic cost. First, we propose a new heuristic for measurement point selection that can be computed efficiently, without requiring the set of diagnoses, once the system is modeled as a Bayesian network and compiled into a logical form known as d-DNNF. Second, we extend hierarchical diagnosis, a technique based on system abstraction from our previous work, to handle probabilities so that it can be applied to sequential diagnosis to allow larger systems to be diagnosed. Third, for the largest systems where even hierarchical diagnosis fails, we propose a novel method that converts the system into one that has a smaller abstraction and whose diagnoses form a superset of those of the original system; the new system can then be diagnosed and the result mapped back to the original system. Finally, we propose a novel cost estimation function which can be used to choose an abstraction of the system that is more likely to provide optimal average cost. Experiments with ISCAS-85 benchmark circuits indicate that our approach scales to all circuits in the suite except one that has a flat structure not susceptible to useful abstraction.

abstraction, cone, diagnosis, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3296

AI Access Foundation

10713

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

Add feedback

Explicit Learning Curves for Transduction and Application to Clustering and Compression Algorithms

Derbeko, P., El-Yaniv, R., Meir, R.

arXiv.org Artificial IntelligenceJun-30-2011

Inductive learning is based on inferring a general rule from a finite data set and using it to label new data. In transduction one attempts to solve the problem of using a labeled training set to label a set of unlabeled points, which are given to the learner prior to learning. Although transduction seems at the outset to be an easier task than induction, there have not been many provably useful algorithms for transduction. Moreover, the precise relation between induction and transduction has not yet been determined. The main theoretical developments related to transduction were presented by Vapnik more than twenty years ago. One of Vapnik's basic results is a rather tight error bound for transductive classification based on an exact computation of the hypergeometric tail. While tight, this bound is given implicitly via a computational routine. Our first contribution is a somewhat looser but explicit characterization of a slightly extended PAC-Bayesian version of Vapnik's transductive bound. This characterization is obtained using concentration inequalities for the tail of sums of random variables obtained by sampling without replacement. We then derive error bounds for compression schemes such as (transductive) support vector machines and for transduction algorithms based on clustering. The main observation used for deriving these new error bounds and algorithms is that the unlabeled test points, which in the transductive setting are known in advance, can be used in order to construct useful data dependent prior distributions over the hypothesis space.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1417

1107.0046

Country:

North America > United States (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback