AITopics

doi: 10.1007/s11222-014-9461-5

1308.2302

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Nishihara, Robert, Minka, Thomas, Tarlow, Daniel

Detecting Parameter Symmetries in Probabilistic Models

arXiv.org Machine LearningDec-18-2013

Probabilistic models play a central role in modern machine learning. They offer a powerful framework for learning from data, and they have found applications in a variety of scientific fields beyond machine learning. A longstanding goal in machine learning and statistics is to achieve a separation between modeling and inference, so that users of these tools may focus on specifying models without having to implement new inference algorithms every time the models change. Recently, work in probabilistic programming has taken up this challenge, seeking to unify probabilistic modeling with computer programming in order to dramatically increase the effectiveness of machine learning experts (DARPA, 2013) and to equip non-experts with effective tools for specifying models and performing inference. We anticipate that continued success toward these goals will decrease the reliance of machine learning practitioners on tried-and-true models and will shift the community toward a paradigm grounded in flexible tools for rapidly prototyping and designing new models (Bishop, 2013).

artificial intelligence, bayesian inference, machine learning, (15 more...)

1312.5386

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Government > Regional Government > North America Government > United States Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Guez, Arthur, Silver, David, Dayan, Peter

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

arXiv.org Artificial IntelligenceDec-18-2013

Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems -- because it avoids expensive applications of Bayes rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing it working in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1205.3109

Country: North America (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
(2 more...)

Frigola, Roger, Lindsten, Fredrik, Schön, Thomas B., Rasmussen, Carl E.

Identification of Gaussian Process State-Space Models with Particle Stochastic Approximation EM

Dept. of Information Technology, Uppsala University, Sweden.Abstract: Gaussian process state-space models (GP-SSMs) are a very flexible family of models of nonlinear dynamical systems. They comprise a Bayesian nonparametric representation of the dynamics of the system and additional (hyper-)parameters governing the properties of this nonparametric representation. The Bayesian formalism enables systematic reasoning about the uncertainty in the system dynamics. We present an approach to maximum likelihood identification of the parameters in GP-SSMs, while retaining the full nonparametric description of the dynamics. The method is based on a stochastic approximation version of the EM algorithm that employs recent developments in particle Markov chain Monte Carlo for efficient identification. INTRODUCTION Inspired by recent developments in robotics and machine learning, we aim at constructing models of nonlinear dynamical systems capable of quantifying the uncertainty in their predictions.

artificial intelligence, gp-ssm, machine learning, (17 more...)

1312.4852

Country:

Europe > United Kingdom > England (0.28)
Europe > Sweden > Uppsala County > Uppsala (0.24)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

The Matrix Ridge Approximation: Algorithms and Applications

Zhang, Zhihua

We are concerned with an approximation problem for a symmetric positive semidefinite matrix due to motivation from a class of nonlinear machine learning methods. We discuss an approximation approach that we call {matrix ridge approximation}. In particular, we define the matrix ridge approximation as an incomplete matrix factorization plus a ridge term. Moreover, we present probabilistic interpretations using a normal latent variable model and a Wishart model for this approximation approach. The idea behind the latent variable model in turn leads us to an efficient EM iterative method for handling the matrix ridge approximation problem. Finally, we illustrate the applications of the approximation approach in multivariate data analysis. Empirical studies in spectral clustering and Gaussian process regression show that the matrix ridge approximation with the EM iteration is potentially useful.

approximation, artificial intelligence, machine learning, (17 more...)

1312.4717

Country: Asia (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.87)

Frigola, Roger, Lindsten, Fredrik, Schön, Thomas B., Rasmussen, Carl E.

Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC

State-space models are successfully used in many areas of science, engineering and economics to model time series and dynamical systems. We present a fully Bayesian approach to inference \emph{and learning} (i.e. state estimation and system identification) in nonlinear nonparametric state-space models. We place a Gaussian process prior over the state transition dynamics, resulting in a flexible model able to capture complex dynamical phenomena. To enable efficient inference, we marginalize over the transition dynamics function and infer directly the joint smoothing distribution using specially tailored Particle Markov Chain Monte Carlo samplers. Once a sample from the smoothing distribution is computed, the state transition predictive distribution can be formulated analytically. Our approach preserves the full nonparametric expressivity of the model and can make use of sparse Gaussian processes to greatly reduce computational complexity.

artificial intelligence, machine learning, trajectory, (16 more...)

1306.2861

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Stable mixed graphs

Sadeghi, Kayvan

In this paper, we study classes of graphs with three types of edges that capture the modified independence structure of a directed acyclic graph (DAG) after marginalisation over unobserved variables and conditioning on selection variables using the $m$-separation criterion. These include MC, summary, and ancestral graphs. As a modification of MC graphs, we define the class of ribbonless graphs (RGs) that permits the use of the $m$-separation criterion. RGs contain summary and ancestral graphs as subclasses, and each RG can be generated by a DAG after marginalisation and conditioning. We derive simple algorithms to generate RGs, from given DAGs or RGs, and also to generate summary and ancestral graphs in a simple way by further extension of the RG-generating algorithm. This enables us to develop a parallel theory on these three classes and to study the relationships between them as well as the use of each class.

artificial intelligence, graph, machine learning, (19 more...)

doi: 10.3150/12-BEJ454

1110.4168

Country:

Europe > United Kingdom (0.46)
North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

arXiv.org Machine LearningDec-16-2013

Probable convexity and its application to Correlated Topic Models

Than, Khoat, Ho, Tu Bao

Non-convex optimization problems often arise from probabilistic modeling, such as estimation of posterior distributions. Non-convexity makes the problems intractable, and poses various obstacles for us to design efficient algorithms. In this work, we attack non-convexity by first introducing the concept of \emph{probable convexity} for analyzing convexity of real functions in practice. We then use the new concept to analyze an inference problem in the \emph{Correlated Topic Model} (CTM) and related nonconjugate models. Contrary to the existing belief of intractability, we show that this inference problem is concave under certain conditions. One consequence of our analyses is a novel algorithm for learning CTM which is significantly more scalable and qualitative than existing methods. Finally, we highlight that stochastic gradient algorithms might be a practical choice to resolve efficiently non-convex problems. This finding might find beneficial in many contexts which are beyond probabilistic modeling.

algorithm, blei and lafferty, matrix, (13 more...)

1312.4527

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > Japan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Fernique, Pierre, Durand, Jean-Baptiste, Guédon, Yann

Parametric Modelling of Multivariate Count Data Using Probabilistic Graphical Models

arXiv.org Machine LearningDec-16-2013

Multivariate count data are defined as the number of items of different categories issued from sampling within a population, which individuals are grouped into categories. The analysis of multivariate count data is a recurrent and crucial issue in numerous modelling problems, particularly in the fields of biology and ecology (where the data can represent, for example, children counts associated with multitype branching processes), sociology and econometrics. We focus on I) Identifying categories that appear simultaneously, or on the contrary that are mutually exclusive. This is achieved by identifying conditional independence relationships between the variables; II)Building parsimonious parametric models consistent with these relationships; III) Characterising and testing the effects of covariates on the joint distribution of the counts. To achieve these goals, we propose an approach based on graphical probabilistic models, and more specifically partially directed acyclic graphs.

artificial intelligence, graph, machine learning, (14 more...)

1312.4479

Country: Europe > France (0.31)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.96)

Allahverdyan, Armen E., Galstyan, Aram

Comparative Analysis of Viterbi Training and Maximum Likelihood Estimation for HMMs

arXiv.org Machine LearningDec-16-2013

We present an asymptotic analysis of Viterbi Training (VT) and contrast it with a more conventional Maximum Likelihood (ML) approach to parameter estimation in Hidden Markov Models. While ML estimator works by (locally) maximizing the likelihood of the observed data, VT seeks to maximize the probability of the most likely hidden state sequence. We develop an analytical framework based on a generating function formalism and illustrate it on an exactly solvable model of HMM with one unambiguous symbol. For this particular model the ML objective function is continuously degenerate. VT objective, in contrast, is shown to have only finite degeneracy. Furthermore, VT converges faster and results in sparser (simpler) models, thus realizing an automatic Occam's razor for HMM learning. For more general scenario VT can be worse compared to ML but still capable of correctly recovering most of the parameters.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1312.4551

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)