AITopics

1506.03805

Country:

North America > Canada > Ontario > Toronto (0.28)
Europe (0.28)

Genre: Research Report (0.64)

Industry:

Education (0.67)
Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

arXiv.org Machine Learning May-26-2016

Provable Algorithms for Inference in Topic Models

Arora, Sanjeev, Ge, Rong, Koehler, Frederic, Ma, Tengyu, Moitra, Ankur

Recently, there has been considerable progress on designing algorithms with provable guarantees, typically using linear algebraic methods, for parameter learning in latent variable models. But designing provable algorithms for inference has proven to be more challenging. Here we take a first step towards provable inference in topic models. We leverage a property of topic models that enables us to construct simple linear estimators for the unknown topic proportions that have small variance, and consequently can work with short documents. Our estimators also correspond to finding an estimate around which the posterior is well-concentrated. We show lower bounds establishing that for shorter documents it can be information-theoretically impossible to find the hidden topics. Finally, we give empirical results that demonstrate that our algorithm works on realistic topic models. It yields good solutions on synthetic data and runs in time comparable to a single iteration of Gibbs sampling.

artificial intelligence, machine learning, natural language, (19 more...)

1605.08491

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

arXiv.org Machine Learning May-26-2016

Combinatorial Topic Models using Small-Variance Asymptotics

Jiang, Ke, Sra, Suvrit, Kulis, Brian

Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In contrast, we study topic modeling as a combinatorial optimization problem, and propose a new objective function derived from LDA by passing to the small-variance limit. We minimize the derived objective by using ideas from combinatorial optimization, which results in a new, fast, and high-quality topic modeling algorithm. In particular, we show that our results are competitive with popular LDA-based topic modeling approaches, and also discuss the (dis)similarities between our approach and its probabilistic counterparts.

artificial intelligence, machine learning, natural language, (17 more...)

1604.02027

Country: North America > United States (0.46)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
(4 more...)

Kucukelbir, Alp, Blei, David M.

Posterior Dispersion Indices

Probabilistic modeling is cyclical: we specify a model, infer its posterior, and evaluate its performance. Evaluation drives the cycle, as we revise our model based on how it performs. This requires a metric. Traditionally, predictive accuracy prevails. Yet, predictive accuracy does not tell the whole story. We propose to evaluate a model through posterior dispersion. The idea is to analyze how each datapoint fares in relation to posterior uncertainty around the hidden structure. We propose a family of posterior dispersion indices (PDI) that capture this idea. A PDI identifies rich patterns of model mismatch in three real data examples: voting preferences, supermarket shopping, and population genetics.

artificial intelligence, likelihood, machine learning, (16 more...)

1605.07604

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Sun, Xing, Yung, Nelson H. C., Lam, Edmund Y., So, Hayden K.-H.

Consistency Analysis for the Doubly Stochastic Dirichlet Process

This technical report proves components consistency for the Doubly Stochastic Dirichlet Process (DSDP) [1] with exponential convergence of posterior probability. We also present the fundamental properties of the DSDP as well as inference algorithms. This report is also a supporting document for the paper "Computationally Efficient Hyperspectral Data Learning Based on the Doubly Stochastic Dirichlet Process" [1]. The probability of data partitions is important in mixture modeling [2]. Let M be the unordered partition of n observations; then the probability mass function [3] of M follows.

artificial intelligence, machine learning, probability, (17 more...)

1605.07358

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Humplik, Jan, Tkačik, Gašper

Semiparametric energy-based probabilistic models

Probabilistic models can be defined by an energy function, where the probability of each state is proportional to the exponential of the state's negative energy. This paper considers a generalization of energy-based models in which the probability of a state is proportional to an arbitrary positive, strictly decreasing, and twice differentiable function of the state's energy. The precise shape of the nonlinear map from energies to unnormalized probabilities has to be learned from data together with the parameters of the energy function. As a case study we show that the above generalization of a fully visible Boltzmann machine yields an accurate model of neural activity of retinal ganglion cells. We attribute this success to the model's ability to easily capture distributions whose probabilities span a large dynamic range, a possible consequence of latent variables that globally couple the system. Similar features have recently been observed in many datasets, suggesting that our new method has wide applicability.
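The definition in this abstract can be made concrete on a tiny state space. The following is a minimal sketch, not the authors' implementation: it enumerates all binary states, applies a map from energy to unnormalized probability, and normalizes. The bias and coupling values and the alternative logistic-style map are hypothetical choices made only for illustration; in the paper the shape of the map is learned from data.

```python
import itertools
import math


def state_probabilities(energy, n_units, link=lambda e: math.exp(-e)):
    """Normalize link(energy(state)) over all binary states of n_units units.

    The default link, exp(-E), gives the standard energy-based (Boltzmann) model.
    Any positive, strictly decreasing function of the energy can be passed instead.
    """
    states = list(itertools.product((0, 1), repeat=n_units))
    weights = [link(energy(s)) for s in states]
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}


# Hypothetical parameters of a fully visible Boltzmann-style energy on 3 units.
BIAS = (0.5, -0.2, 0.1)
COUPLING = {(0, 1): -1.0, (1, 2): 0.3}


def energy(state):
    """Energy = sum of unit biases plus pairwise coupling terms."""
    e = sum(b * x for b, x in zip(BIAS, state))
    e += sum(w * state[i] * state[j] for (i, j), w in COUPLING.items())
    return e


# Standard case: probability proportional to exp(-energy).
boltzmann = state_probabilities(energy, 3)

# Generalized case with an assumed map 1 / (1 + exp(E)), which is positive,
# strictly decreasing, and twice differentiable in the energy E.
generalized = state_probabilities(
    energy, 3, link=lambda e: 1.0 / (1.0 + math.exp(e))
)
```

Because both maps are strictly decreasing, the two models rank states identically by energy; only the spread of probability mass across states differs.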

artificial intelligence, machine learning, nonlinearity, (18 more...)

1605.07371

Country: North America > United States (0.69)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Learning Nonparametric Forest Graphical Models with Prior Information

Zhu, Yuancheng, Liu, Zhe, Sun, Siqi

We present a framework for incorporating prior information into nonparametric estimation of graphical models. To avoid distributional assumptions, we restrict the graph to be a forest and build on the work of forest density estimation (FDE). We reformulate the FDE approach from a Bayesian perspective, and introduce prior distributions on the graphs. As two concrete examples, we apply this framework to estimating scale-free graphs and learning multiple graphs with similar structures. The resulting algorithms are equivalent to finding a maximum spanning tree of a weighted graph with a penalty term on the connectivity pattern of the graph. We solve the optimization problem via a minorize-maximization procedure with Kruskal's algorithm. Simulations show that the proposed methods outperform competing parametric methods, and are robust to the true data distribution. They also lead to improvement in predictive power and interpretability in two real data sets.
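The spanning-tree step mentioned in this abstract can be sketched with Kruskal's algorithm and a union-find structure. This is an illustrative sketch only: it covers the plain maximum-weight spanning forest and omits the paper's penalty term and minorize-maximization loop. The function name, the `min_weight` cutoff, and the edge weights are assumptions standing in for whatever pairwise scores the method assigns.

```python
def max_spanning_forest(n_nodes, weighted_edges, min_weight=0.0):
    """Kruskal's algorithm for a maximum-weight spanning forest.

    weighted_edges holds (weight, u, v) tuples. Edges with weight at or below
    min_weight are never added, so the result may be a forest, not a tree.
    """
    parent = list(range(n_nodes))

    def find(u):
        # Follow parent pointers to the component root, halving the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    forest = []
    # Visit edges from heaviest to lightest.
    for w, u, v in sorted(weighted_edges, reverse=True):
        if w <= min_weight:
            break
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # adding the edge creates no cycle
            parent[root_u] = root_v
            forest.append((u, v, w))
    return forest


# Hypothetical pairwise scores on 5 variables.
edges = [(0.9, 0, 1), (0.8, 1, 2), (0.7, 0, 2), (0.4, 2, 3), (-0.1, 3, 4)]
forest = max_spanning_forest(5, edges)
```

In this toy input the edge between variables 0 and 2 is skipped because it would close a cycle, and variable 4 stays isolated because its only edge has a non-positive score.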

artificial intelligence, bayesian inference, machine learning, (19 more...)

1511.03796

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Bornschein, Jorg, Shabanian, Samira, Fischer, Asja, Bengio, Yoshua

Bidirectional Helmholtz Machines

Efficient unsupervised training and inference in deep generative models remains a challenging problem. One basic approach, called the Helmholtz machine, involves training a top-down directed generative model together with a bottom-up auxiliary model used for approximate inference. Recent results indicate that better generative models can be obtained with better approximate inference procedures. Instead of improving the inference procedure, we here propose a new model which guarantees that the top-down and bottom-up distributions can efficiently invert each other. We achieve this by interpreting both the top-down and the bottom-up directed models as approximate inference distributions and by defining the model distribution to be the geometric mean of these two. We present a lower bound for the likelihood of this model and we show that optimizing this bound regularizes the model so that the Bhattacharyya distance between the bottom-up and top-down approximate distributions is minimized. This approach results in state-of-the-art generative models which prefer significantly deeper architectures, while allowing for orders of magnitude more efficient approximate inference.
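The geometric-mean construction can be illustrated on a small discrete example. This is a sketch under stated assumptions, not the paper's training procedure: the two joint tables `p` (top-down) and `q` (bottom-up) over a pair (x, h) are made-up numbers, and the function name is hypothetical. The normalizer of the geometric mean is the Bhattacharyya coefficient of the two distributions, so its negative logarithm is their Bhattacharyya distance.

```python
import math


def geometric_mean_model(p, q):
    """Return the normalized geometric mean of p and q, and their
    Bhattacharyya distance.

    p and q are dicts mapping the same outcomes to probabilities.
    """
    unnormalized = {k: math.sqrt(p[k] * q[k]) for k in p}
    z = sum(unnormalized.values())  # Bhattacharyya coefficient, in (0, 1]
    p_star = {k: v / z for k, v in unnormalized.items()}
    return p_star, -math.log(z)


# Hypothetical joint distributions over (x, h) with binary x and h.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
q = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3}

p_star, distance = geometric_mean_model(p, q)
```

The distance is zero exactly when the two distributions agree, which is the sense in which shrinking it pushes the top-down and bottom-up models to invert each other.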

artificial intelligence, generative model, machine learning, (20 more...)

1506.03877

Country: North America > Canada (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)

arXiv.org Machine Learning May-23-2016

Bayesian Model Selection of Stochastic Block Models

Yan, Xiaoran

Abstract: A central problem in analyzing networks is partitioning them into modules or communities. One of the best tools for this is the stochastic block model, which clusters vertices into blocks with a statistically homogeneous pattern of links. Despite its flexibility and popularity, there has been a lack of principled statistical model selection criteria for the stochastic block model. Here we propose a Bayesian framework for choosing the number of blocks as well as comparing it to the more elaborate degree-corrected block models, ultimately leading to a universal model selection framework capable of comparing multiple modeling combinations. We will also investigate its connection to the minimum description length principle. Introduction: An important task in network analysis is community detection, or finding groups of similar vertices which can then be analyzed separately [1]. Community structures offer clues to the processes which generated the graph, on scales ranging from face-to-face social interaction [2] through social-media communications [3] to the organization of food webs [4]. However, previous work often defines a "community" as a group of vertices with a high density of connections within the group and a low density of connections to the rest of the network. While this type of assortative community structure is generally the case in social networks, we are interested in a more general definition of functional community: a group of vertices that connect to the rest of the network in similar ways. A set of similar predators form a functional group in a food web, not because they eat each other, but because they feed on similar prey.
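The notion of a functional community can be illustrated by sampling from a basic stochastic block model. This is a minimal sketch of the generative model only, not the paper's model selection procedure. The function name, the block labels, and the link probabilities are hypothetical, chosen to mimic the predator and prey example: predators never link to each other but link to prey with high probability.

```python
import random


def sample_sbm(blocks, link_prob, rng=None):
    """Sample an undirected graph from a stochastic block model.

    blocks[u] is the block label of vertex u; link_prob[a][b] is the
    probability of an edge between a vertex in block a and one in block b.
    """
    rng = rng or random.Random(0)
    n = len(blocks)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Each pair is linked independently, depending only on its blocks.
            if rng.random() < link_prob[blocks[u]][blocks[v]]:
                edges.append((u, v))
    return edges


# Block 0 = predators, block 1 = prey (hypothetical toy food web).
blocks = [0, 0, 0, 1, 1, 1, 1]
link_prob = [
    [0.0, 0.9],  # predators: no links among themselves, many to prey
    [0.9, 0.1],  # prey: few links among themselves
]
edges = sample_sbm(blocks, link_prob)
```

The predators form a block even though no edge joins any two of them, because they connect to the rest of the network in the same way.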

artificial intelligence, bayesian inference, machine learning, (17 more...)

1605.07057

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kersting, Hans, Hennig, Philipp

Active Uncertainty Calibration in Bayesian ODE Solvers

arXiv.org Machine Learning May-23-2016

There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al. introduced a sampling-based class of methods that are 'well-calibrated' in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al. pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF), which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.

artificial intelligence, machine learning, solver, (19 more...)

1605.03364

Country: Europe > Germany (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)