AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Mathematical Language Processing: Automatic Grading and Feedback for Open Response Mathematical Questions

Lan, Andrew S., Vats, Divyanshu, Waters, Andrew E., Baraniuk, Richard G.

arXiv.org Machine LearningJan-18-2015

While computer and communication technologies have provided effective means to scale up many aspects of education, the submission and grading of assessments such as homework assignments and tests remains a weak link. In this paper, we study the problem of automatically grading the kinds of open response mathematical questions that figure prominently in STEM (science, technology, engineering, and mathematics) courses. Our data-driven framework for mathematical language processing (MLP) leverages solution data from a large number of learners to evaluate the correctness of their solutions, assign partial-credit scores, and provide feedback to each learner on the likely locations of any errors. MLP takes inspiration from the success of natural language processing for text data and comprises three main steps. First, we convert each solution to an open response mathematical question into a series of numerical features. Second, we cluster the features from several solutions to uncover the structures of correct, partially correct, and incorrect solutions. We develop two different clustering approaches, one that leverages generic clustering algorithms and one based on Bayesian nonparametrics. Third, we automatically grade the remaining (potentially large number of) solutions based on their assigned cluster and one instructor-provided grade per cluster. As a bonus, we can track the cluster assignment of each step of a multistep solution and determine when it departs from a cluster of correct solutions, which enables us to indicate the likely locations of errors to learners. We test and validate MLP on real-world MOOC data to demonstrate how it can substantially reduce the human effort required in large-scale educational platforms.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1501.04346

Country: North America (0.28)

Genre: Instructional Material > Online (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
(2 more...)

Add feedback

Structure Learning in Bayesian Networks of Moderate Size by Efficient Sampling

He, Ru, Tian, Jin, Wu, Huaiqing

arXiv.org Machine LearningJan-18-2015

We study the Bayesian model averaging approach to learning Bayesian network structures (DAGs) from data. We develop new algorithms including the first algorithm that is able to efficiently sample DAGs according to the exact structure posterior. The DAG samples can then be used to construct estimators for the posterior of any feature. We theoretically prove good properties of our estimators and empirically show that our estimators considerably outperform the estimators from the previous state-of-the-art methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1501.0437

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Prediction and Modularity in Dynamical Systems

Kolchinsky, Artemy, Rocha, Luis M.

arXiv.org Artificial IntelligenceJan-16-2015

Identifying and understanding modular organizations is centrally important in the study of complex systems. Several approaches to this problem have been advanced, many framed in information-theoretic terms. Our treatment starts from the complementary point of view of statistical modeling and prediction of dynamical systems. It is known that for finite amounts of training data, simpler models can have greater predictive power than more complex ones. We use the trade-off between model simplicity and predictive accuracy to generate optimal multiscale decompositions of dynamical networks into weakly-coupled, simple modules. State-dependent and causal versions of our method are also proposed.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1106.3703

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Dirichlet Process Parsimonious Mixtures for clustering

Chamroukhi, Faicel, Bartcus, Marius, Glotin, Hervé

arXiv.org Machine LearningJan-14-2015

The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian prospective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models are Bayesian nonparametric parsimonious mixture models that allow to simultaneously infer the model parameters, the optimal number of mixture components and the optimal parsimonious mixture structure from the data. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the developed DPMM models and provide a Bayesian model selection framework by using Bayes factors. We apply them to cluster simulated data and real data sets, and compare them to the standard parsimonious mixture models. The obtained results highlight the effectiveness of the proposed nonparametric parsimonious mixture models as a good nonparametric alternative for the parametric parsimonious models.

artificial intelligence, machine learning, mixture model, (16 more...)

arXiv.org Machine Learning

1501.03347

Country: Europe > France (0.30)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Combined modeling of sparse and dense noise for improvement of Relevance Vector Machine

Sundin, Martin, Chatterjee, Saikat, Jansson, Magnus

arXiv.org Machine LearningJan-12-2015

Using a Bayesian approach, we consider the problem of recovering sparse signals under additive sparse and dense noise. Typically, sparse noise models outliers, impulse bursts or data loss. To handle sparse noise, existing methods simultaneously estimate the sparse signal of interest and the sparse noise of no interest. For estimating the sparse signal, without the need of estimating the sparse noise, we construct a robust Relevance Vector Machine (RVM). In the RVM, sparse noise and ever present dense noise are treated through a combined noise model. The precision of combined noise is modeled by a diagonal matrix. We show that the new RVM update equations correspond to a non-symmetric sparsity inducing cost function. Further, the combined modeling is found to be computationally more efficient. We also extend the method to block-sparse signals and noise with known and unknown block structures. Through simulations, we show the performance and computation efficiency of the new RVM in several applications: recovery of sparse and block sparse signals, housing price prediction and image denoising.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1501.02579

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Identifiability and optimal rates of convergence for parameters of multiple types in finite mixtures

Ho, Nhat, Nguyen, XuanLong

arXiv.org Machine LearningJan-11-2015

This paper studies identifiability and convergence behaviors for parameters of multiple types in finite mixtures, and the effects of model fitting with extra mixing components. First, we present a general theory for strong identifiability, which extends from the previous work of Nguyen [2013] and Chen [1995] to address a broad range of mixture models and to handle matrix-variate parameters. These models are shown to share the same Wasserstein distance based optimal rates of convergence for the space of mixing distributions --- $n^{-1/2}$ under $W_1$ for the exact-fitted and $n^{-1/4}$ under $W_2$ for the over-fitted setting, where $n$ is the sample size. This theory, however, is not applicable to several important model classes, including location-scale multivariate Gaussian mixtures, shape-scale Gamma mixtures and location-scale-shape skew-normal mixtures. The second part of this work is devoted to demonstrating that for these "weakly identifiable" classes, algebraic structures of the density family play a fundamental role in determining convergence rates of the model parameters, which display a very rich spectrum of behaviors. For instance, the optimal rate of parameter estimation in an over-fitted location-covariance Gaussian mixture is precisely determined by the order of a solvable system of polynomial equations --- these rates deteriorate rapidly as more extra components are added to the model. The established rates for a variety of settings are illustrated by a simulation study.

convergence, identifiability, mixture model, (16 more...)

arXiv.org Machine Learning

1501.02497

Country: North America > United States > Michigan (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Fast and optimal nonparametric sequential design for astronomical observations

Yang, Justin J., Wang, Xufei, Protopapas, Pavlos, Bornn, Luke

arXiv.org Machine LearningJan-11-2015

The spectral energy distribution (SED) is a relatively easy way for astronomers to distinguish between different astronomical objects such as galaxies, black holes, and stellar objects. By comparing the observations from a source at different frequencies with template models, astronomers are able to infer the type of this observed object. In this paper, we take a Bayesian model averaging perspective to learn astronomical objects, employing a Bayesian nonparametric approach to accommodate the deviation from convex combinations of known log-SEDs. To effectively use telescope time for observations, we then study Bayesian nonparametric sequential experimental design without conjugacy, in which we use sequential Monte Carlo as an efficient tool to maximize the volume of information stored in the posterior distribution of the parameters of interest. A new technique for performing inferences in log-Gaussian Cox processes called the Poisson log-normal approximation is also proposed. Simulations show the speed, accuracy, and usefulness of our method. While the strategy we propose in this paper is brand new in the astronomy literature, the inferential techniques developed apply to more general nonparametric sequential experimental design problems.

artificial intelligence, machine learning, template, (17 more...)

arXiv.org Machine Learning

1501.02467

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Coherent Predictive Inference under Exchangeability with Imprecise Probabilities

De Cooman, Gert, De Bock, Jasper, Diniz, Márcio Alves

Journal of Artificial Intelligence ResearchJan-10-2015

Coherent reasoning under uncertainty can be represented in a very general manner by coherent sets of desirable gambles. In a context that does not allow for indecision, this leads to an approach that is mathematically equivalent to working with coherent conditional probabilities. If we do allow for indecision, this leads to a more general foundation for coherent (imprecise-)probabilistic inference. In this framework, and for a given finite category set, coherent predictive inference under exchangeability can be represented using Bernstein coherent cones of multivariate polynomials on the simplex generated by this category set. This is a powerful generalisation of de Finetti's Representation Theorem allowing for both imprecision and indecision. We define an inference system as a map that associates a Bernstein coherent cone of polynomials with every finite category set. Many inference principles encountered in the literature can then be interpreted, and represented mathematically, as restrictions on such maps. We discuss, as particular examples, two important inference principles: representation insensitivitya strengthened version of Walley's representation invarianceand specificity. We show that there is an infinity of inference systems that satisfy these two principles, amongst which we discuss in particular the skeptically cautious inference system, the inference systems corresponding to (a modified version of) Walley and Bernard's Imprecise Dirichlet Multinomial Models (IDMM), the skeptical IDMM inference systems, and the Haldane inference system. We also prove that the latter produces the same posterior inferences as would be obtained using Haldane's improper prior, implying that there is an infinity of proper priors that produce the same coherent posterior inferences as Haldane's improper one. Finally, we impose an additional inference principle that allows us to characterise uniquely the immediate predictions for the IDMM inference systems.

category, equation, inference system, (12 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4490

AI Access Foundation

10925

Journal of Artificial Intelligence Research

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

The SP theory of intelligence: an overview

Wolff, J. Gerard

arXiv.org Artificial IntelligenceJan-7-2015

This article is an overview of the "SP theory of intelligence". The theory aims to simplify and integrate concepts across artificial intelligence, mainstream computing and human perception and cognition, with information compression as a unifying theme. It is conceived as a brain-like system that receives 'New' information and stores some or all of it in compressed form as 'Old' information. It is realised in the form of a computer model -- a first version of the SP machine. The concept of "multiple alignment" is a powerful central idea. Using heuristic techniques, the system builds multiple alignments that are 'good' in terms of information compression. For each multiple alignment, probabilities may be calculated. These provide the basis for calculating the probabilities of inferences. The system learns new structures from partial matches between patterns. Using heuristic techniques, the system searches for sets of structures that are 'good' in terms of information compression. These are normally ones that people judge to be 'natural', in accordance with the 'DONSVIC' principle -- the discovery of natural structures via information compression. The SP theory may be applied in several areas including 'computing', aspects of mathematics and logic, representation of knowledge, natural language processing, pattern recognition, several kinds of reasoning, information storage and retrieval, planning and problem solving, information compression, neuroscience, and human perception and cognition. Examples include the parsing and production of language including discontinuous dependencies in syntax, pattern recognition at multiple levels of abstraction and its integration with part-whole relations, nonmonotonic reasoning and reasoning with default values, reasoning in Bayesian networks including 'explaining away', causal diagnosis, and the solving of a geometric analogy problem.

machine learning, multiple alignment, pattern recognition, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/info4030283

1306.3888

Country:

North America > United States (0.46)
Europe (0.27)

Genre: Research Report (0.81)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(4 more...)

Add feedback

Inverse Renormalization Group Transformation in Bayesian Image Segmentations

Tanaka, Kazuyuki, Kataoka, Shun, Yasuda, Muneki, Ohzeki, Masayuki

arXiv.org Machine LearningJan-5-2015

Graduate School of Informatics, Kyoto University, 36-1 Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501 Japan A new Bayesian image segmentation algorithm is proposed by combining a loopy belief propagation with an inverse real space renormalization group transformation to reduce the computational time. In results of our experiment, we observe that the proposed method can reduce the computational time to less than one-tenth of that taken by conventional Bayesian approaches. Bayesian segmentation modeling based on Markov random fields (MRF's) is one of the interesting research topics We consider an image as defined on a set of pixels arranged on a square grid graph (V,E). HereV { i i 1, 2,···, V } denotes the set of all the pixels andE is the set of all the nearest-neighbour pairs of pixels{ i,j} . The total numbers of elements in the setsV and E are denoted by V and E, respectively.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

doi: 10.7566/JPSJ.84.045001

1501.00834

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.45)
Asia > Japan > Honshū > Tōhoku (0.15)

Genre: Research Report (0.40)

Industry: Education (0.36)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.65)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Add feedback