AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

$L_0$ regularized estimation for nonlinear models that have sparse underlying linear structures

Chi, Zhiyi

arXiv.org Machine LearningOct-13-2009

We study the estimation of $\beta$ for the nonlinear model $y = f(X\sp{\top}\beta) + \epsilon$ when $f$ is a nonlinear transformation that is known, $\beta$ has sparse nonzero coordinates, and the number of observations can be much smaller than that of parameters ($n\ll p$). We show that in order to bound the $L_2$ error of the $L_0$ regularized estimator $\hat\beta$, i.e., $\|\hat\beta - \beta\|_2$, it is sufficient to establish two conditions. Based on this, we obtain bounds of the $L_2$ error for (1) $L_0$ regularized maximum likelihood estimation (MLE) for exponential linear models and (2) $L_0$ regularized least square (LS) regression for the more general case where $f$ is analytic. For the analytic case, we rely on power series expansion of $f$, which requires taking into account the singularities of $f$.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

0910.2517

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Finite element model selection using Particle Swarm Optimization

Mthembu, Linda, Marwala, Tshilidzi, Friswell, Michael I., Adhikari, Sondipon

arXiv.org Artificial IntelligenceOct-12-2009

This paper proposes the application of particle swarm optimization (PSO) to the problem of finite element model (FEM) selection. This problem arises when a choice of the best model for a system has to be made from set of competing models, each developed a priori from engineering judgment. PSO is a population-based stochastic search algorithm inspired by the behaviour of biological entities in nature when they are foraging for resources. Each potentially correct model is represented as a particle that exhibits both individualistic and group behaviour. Each particle moves within the model search space looking for the best solution by updating the parameters values that define it. The most important step in the particle swarm algorithm is the method of representing models which should take into account the number, location and variables of parameters to be updated. One example structural system is used to show the applicability of PSO in finding an optimal FEM. An optimal model is defined as the model that has the least number of updated parameters and has the smallest parameter variable variation from the mean material properties. Two different objective functions are used to compare performance of the PSO algorithm.

algorithm, finite element model, objective function, (13 more...)

arXiv.org Artificial Intelligence

0910.2217

Country:

Europe > United Kingdom (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Africa > South Africa > Gauteng > Johannesburg (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

BRAINSTORMING: Consensus Learning in Practice

Plewczynski, Dariusz

arXiv.org Machine LearningOct-6-2009

Keywords: machine learning, consensus, meta-learning, bioinformatics, chemoinformatics, brainstorming, neural networks, support vector machines, decision trees, random forest, genetic algorithms, nearest neighbours, trend vectors Abstract: We present here an introduction to Brainstorming approach, that was recently proposed as a consensus meta-learning technique, and used in several practical applications in bioinformatics and chemoinformatics. In the second step all solutions are gathered and the consensus is build between them. Therefore no early solution, given even by a generally low performing algorithm, is not discarder until the late phase of prediction, when the final conclusion is drawn by comparing different machine learning models. This final phase, i.e. consensus learning, is trying to balance the generality of solution and the overall performance of trained model. 1 INTRODUCTION A novel meta-approach emerging in bioinformatics is called consensus learning. Then all solutions are gathered, and the consensus is build between them. Therefore no early solution, given even by a generally low performing algorithm, is not discarder until the late phase of prediction, when the final conclusion is drawn by comparing different machine learning models.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Machine Learning

0910.0949

Country: Europe > Poland (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.57)

Add feedback

Statistical Decision Making for Authentication and Intrusion Detection

Dimitrakakis, Christos, Mitrokotsa, Aikaterini

arXiv.org Machine LearningOct-5-2009

Classification is the problem of categorising data in one of two or more possible classes. In the classical supervised learning framework, examples of each class have already been obtained and the task of the decision maker is to accurately categorise new observations, whose class is unknown. The accuracy is either measured in terms of the rate of misclassification, or in terms of the average cost, for problems where different types of errors carry different costs. In that setting, the problem has three phases: (a) the collection of training data, (b) the estimation of a decision rule based on the training data and (c) the application 1 of the decision rule to new data. Typically, the decision rule remains fixed after the second step.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

0910.0483

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Discrete MDL Predicts in Total Variation

Hutter, Marcus

arXiv.org Machine LearningSep-24-2009

The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model. We show that for a countable class of models, MDL predictions are close to the true distribution in a strong sense. The result is completely general. No independence, ergodicity, stationarity, identifiability, or other assumption on the model class need to be made. More formally, we show that for any countable class of models, the distributions selected by MDL (or MAP) asymptotically predict (merge with) the true measure in the class in total variation distance. Implications for non-i.i.d. domains like time-series forecasting, discriminative learning, and reinforcement learning are discussed.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Machine Learning

0909.4588

Country:

Europe (0.46)
North America (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Add feedback

Efficient Solutions to Factored MDPs with Imprecise Transition Probabilities

Delgado, Karina Valdivia (University of Sao Paulo) | Sanner, Scott (NICTA-ANU) | Barros, Leliane Nunes de (University of Sao Paulo) | Cozman, Fabio Gagliardi (University of Sao Paulo)

AAAI ConferencesSep-19-2009

When modeling real-world decision-theoretic planning problems in the Markov decision process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or data, or non-stationary transition distributions arising from insufficient state knowledge. In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IPs) has been introduced to model such scenarios. Unfortunately, while solutions to the MDP-IP are well-known, they require nonlinear optimization and are extremely time-consuming in practice. To address this deficiency, we propose efficient dynamic programming methods to exploit the structure of factored MDPIPs. Noting that the key computational bottleneck in the solution of MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional “flat” dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs.

algorithm, mdp-ip, probability, (15 more...)

AAAI Conferences

Nineteenth International Conference on Automated Planning and Scheduling

Country:

South America > Brazil > São Paulo (0.05)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Africa > Togo (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

A Bayesian Framework for Collaborative Multi-Source Signal Detection

Couillet, Romain, Debbah, Merouane

arXiv.org Artificial IntelligenceSep-8-2009

This paper introduces a Bayesian framework to detect multiple signals embedded in noisy observations from a sensor array. For various states of knowledge on the communication channel and the noise at the receiving sensors, a marginalization procedure based on recent tools of finite random matrix theory, in conjunction with the maximum entropy principle, is used to compute the hypothesis selection criterion. Quite remarkably, explicit expressions for the Bayesian detector are derived which enable to decide on the presence of signal sources in a noisy wireless environment. The proposed Bayesian detector is shown to outperform the classical power detector when the noise power is known and provides very good performance for limited knowledge on the noise power. Simulations corroborate the theoretical results and quantify the gain achieved using the proposed Bayesian framework.

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

0811.0764

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies

Blei, David M., Griffiths, Thomas L., Jordan, Michael I.

arXiv.org Machine LearningAug-27-2009

We present the nested Chinese restaurant process (nCRP), a stochastic process which assigns probability distributions to infinitely-deep, infinitely-branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning--the use of Bayesian nonparametric methods to infer distributions on flexible data structures.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

0710.0845

Country: North America > United States > California (0.46)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Consumer Products & Services > Restaurants (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Kronecker Graphs: An Approach to Modeling Networks

Leskovec, Jure, Chakrabarti, Deepayan, Kleinberg, Jon, Faloutsos, Christos, Ghahramani, Zoubin

arXiv.org Machine LearningAug-21-2009

How can we model networks with a mathematically tractable model that allows for rigorous analysis of network properties? Networks exhibit a long list of surprising properties: heavy tails for the degree distribution; small diameters; and densification and shrinking diameters over time. Most present network models either fail to match several of the above properties, are complicated to analyze mathematically, or both. In this paper we propose a generative model for networks that is both mathematically tractable and can generate networks that have the above mentioned properties. Our main idea is to use the Kronecker product to generate graphs that we refer to as "Kronecker graphs". First, we prove that Kronecker graphs naturally obey common network properties. We also provide empirical evidence showing that Kronecker graphs can effectively model the structure of real networks. We then present KronFit, a fast and scalable algorithm for fitting the Kronecker graph generation model to large real networks. A naive approach to fitting would take super- exponential time. In contrast, KronFit takes linear time, by exploiting the structure of Kronecker matrix multiplication and by using statistical simulation techniques. Experiments on large real and synthetic networks show that KronFit finds accurate parameters that indeed very well mimic the properties of target networks. Once fitted, the model parameters can be used to gain insights about the network structure, and the resulting synthetic graphs can be used for null- models, anonymization, extrapolations, and graph summarization.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

0812.4905

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

A hierarchical Dirichlet process mixture model for haplotype reconstruction from multi-population data

Sohn, Kyung-Ah, Xing, Eric P.

arXiv.org Machine LearningAug-20-2009

The perennial problem of "how many clusters?" remains an issue of substantial interest in data mining and machine learning communities, and becomes particularly salient in large data sets such as populational genomic data where the number of clusters needs to be relatively large and open-ended. This problem gets further complicated in a co-clustering scenario in which one needs to solve multiple clustering problems simultaneously because of the presence of common centroids (e.g., ancestors) shared by clusters (e.g., possible descents from a certain ancestor) from different multiple-cluster samples (e.g., different human subpopulations). In this paper we present a hierarchical nonparametric Bayesian model to address this problem in the context of multi-population haplotype inference. Uncovering the haplotypes of single nucleotide polymorphisms is essential for many biological and medical applications. While it is uncommon for the genotype data to be pooled from multiple ethnically distinct populations, few existing programs have explicitly leveraged the individual ethnic information for haplotype inference. In this paper we present a new haplotype inference program, Haploi, which makes use of such information and is readily applicable to genotype sequences with thousands of SNPs from heterogeneous populations, with competent and sometimes superior speed and accuracy comparing to the state-of-the-art programs. Underlying Haploi is a new haplotype distribution model based on a nonparametric Bayesian formalism known as the hierarchical Dirichlet process, which represents a tractable surrogate to the coalescent process. The proposed model is exchangeable, unbounded, and capable of coupling demographic information of different populations.

artificial intelligence, haplotype, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1214/08-AOAS225

0812.4648

Country: North America > United States > Pennsylvania (0.28)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback