AITopics

1511.04817

Genre: Research Report (0.50)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Fan, Kai, Heller, Katherine

$k$-means: Fighting against Degeneracy in Sequential Monte Carlo with an Application to Tracking

arXiv.org Machine Learning, Nov-12-2015

For regular particle filter algorithms, or Sequential Monte Carlo (SMC) methods, the initial weights are traditionally dependent on the proposal distribution, the posterior distribution at the current timestamp in the sampled sequence, and the target is the posterior distribution of the previous timestamp. This is technically correct, but leads to algorithms which usually have practical issues with degeneracy, where all particles eventually collapse onto a single particle. In this paper, we propose and evaluate using $k$-means clustering to attack and even take advantage of this degeneracy. Specifically, we propose a Stochastic SMC algorithm which initializes the set of $k$ means, providing the initial centers chosen from the collapsed particles. To fight against degeneracy, we adjust the regular SMC weights, mediated by cluster proportions, and then correct them to retain the same expectation as before. We experimentally demonstrate that our approach has better performance than vanilla algorithms.
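The degeneracy described in this abstract can be made concrete with a small sketch. The code below is not the authors' $k$-means method; it is a plain sequential importance sampler on an assumed toy model (Gaussian random-walk state, unit-variance Gaussian observations), and all function names are hypothetical. It tracks the effective sample size, a standard diagnostic that equals the number of particles for uniform weights and approaches 1 as the weights collapse onto a single particle.

```python
import math
import random


def normalize(log_weights):
    """Convert log-weights to normalized weights (max-subtraction for stability)."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return [x / total for x in w]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


def multinomial_resample(particles, weights, rng):
    """Standard remedy: draw N particles with replacement, proportionally to weight."""
    return rng.choices(particles, weights=weights, k=len(particles))


def sis_ess_trace(observations, n_particles=200, seed=0):
    """Run importance sampling WITHOUT resampling and record the ESS per step.

    Toy model (an assumption of this sketch): x_t = x_{t-1} + N(0, 1),
    y_t = x_t + N(0, 1).
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    log_w = [0.0] * n_particles
    trace = []
    for y in observations:
        # Propagate each particle through the random-walk transition.
        particles = [x + rng.gauss(0.0, 1.0) for x in particles]
        # Accumulate the Gaussian log-likelihood (constants cancel on normalizing).
        log_w = [lw - 0.5 * (y - x) ** 2 for lw, x in zip(log_w, particles)]
        trace.append(effective_sample_size(normalize(log_w)))
    return trace
```

Calling `sis_ess_trace([0.0] * 30)` returns one ESS value per time step; without resampling the values typically shrink toward 1, which is the collapse the abstract refers to. A common rule of thumb is to call `multinomial_resample` whenever the ESS drops below half the particle count.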

artificial intelligence, machine learning, particle

1511.04157

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.85)

Glynn, Chris, Tokdar, Surya T., Banks, David L., Howard, Brian

Bayesian Analysis of Dynamic Linear Topic Models

arXiv.org Machine Learning, Nov-12-2015

In dynamic topic modeling, the proportional contribution of a topic to a document depends on the temporal dynamics of that topic's overall prevalence in the corpus. We extend the Dynamic Topic Model of Blei and Lafferty (2006) by explicitly modeling document level topic proportions with covariates and dynamic structure that includes polynomial trends and periodicity. A Markov Chain Monte Carlo (MCMC) algorithm that utilizes Polya-Gamma data augmentation is developed for posterior inference. Conditional independencies in the model and sampling are made explicit, and our MCMC algorithm is parallelized where possible to allow for inference in large corpora. To address computational bottlenecks associated with Polya-Gamma sampling, we appeal to the Central Limit Theorem to develop a Gaussian approximation to the Polya-Gamma random variable. This approximation is fast and reliable for parameter values relevant in the text mining domain. Our model and inference algorithm are validated with multiple simulation examples, and we consider the application of modeling trends in PubMed abstracts. We demonstrate that sharing information across documents is critical for accurately estimating document-specific topic proportions. We also show that explicitly modeling polynomial and periodic behavior improves our ability to predict topic prevalence at future time points.
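One plausible reading of the Gaussian approximation mentioned in this abstract is moment matching: a Polya-Gamma variable PG(b, c) with integer b is a sum of b independent PG(1, c) variables, so for large b the Central Limit Theorem suggests a normal distribution with the same mean and variance. The sketch below uses the known closed-form PG moments; the function names are hypothetical, and whether the authors use exactly this construction is an assumption.

```python
import math
import random


def pg_mean(b, c):
    """Mean of PG(b, c): b / (2c) * tanh(c / 2), with limit b / 4 as c -> 0."""
    c = abs(c)  # the moments depend on c only through |c|
    if c < 1e-6:
        return b / 4.0
    return b / (2.0 * c) * math.tanh(c / 2.0)


def pg_var(b, c):
    """Variance of PG(b, c): b / (4 c^3) * (sinh(c) - c) * sech(c / 2)^2."""
    c = abs(c)
    if c < 1e-3:
        return b / 24.0  # limiting value as c -> 0
    h = c / 2.0
    t = math.tanh(h)
    # Algebraically equivalent form using sech^2 = 1 - tanh^2;
    # it avoids overflow in sinh for large c.
    return b / (2.0 * c ** 3) * (t - h * (1.0 - t * t))


def pg_gaussian_draw(b, c, rng):
    """Approximate PG(b, c) draw from the moment-matched normal distribution.

    Caveat: a normal draw can be negative, whereas a true PG variable is
    positive; the approximation is only sensible when b is large.
    """
    return rng.gauss(pg_mean(b, c), math.sqrt(pg_var(b, c)))
```

For example, `pg_gaussian_draw(100, 1.0, random.Random(0))` draws from a normal distribution centered at `50 * tanh(0.5)`. In a text setting b would play the role of a word count, which is where large values of b arise.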

artificial intelligence, machine learning, natural language

1511.03947

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Negahban, Sahand, Oh, Sewoong, Shah, Devavrat

Rank Centrality: Ranking from Pair-wise Comparisons

arXiv.org Machine Learning, Nov-12-2015

The question of aggregating pair-wise comparisons to obtain a global ranking over a collection of objects has been of interest for a very long time: be it ranking of online gamers (e.g. MSR's TrueSkill system) and chess players, aggregating social opinions, or deciding which product to sell based on transactions. In most settings, in addition to obtaining a ranking, finding 'scores' for each object (e.g. a player's rating) is of interest for understanding the intensity of the preferences. In this paper, we propose Rank Centrality, an iterative rank aggregation algorithm for discovering scores for objects (or items) from pair-wise comparisons. The algorithm has a natural random walk interpretation over the graph of objects with an edge present between a pair of objects if they are compared; the score, which we call Rank Centrality, of an object turns out to be its stationary probability under this random walk. To study the efficacy of the algorithm, we consider the popular Bradley-Terry-Luce (BTL) model (equivalent to the Multinomial Logit (MNL) for pair-wise comparisons) in which each object has an associated score which determines the probabilistic outcomes of pair-wise comparisons between objects. In terms of the pair-wise marginal probabilities, which are the main subject of this paper, the MNL model and the BTL model are identical. We bound the finite sample error rates between the scores assumed by the BTL model and those estimated by our algorithm. In particular, the number of samples required to learn the score well with high probability depends on the structure of the comparison graph. When the Laplacian of the comparison graph has a strictly positive spectral gap, e.g. each item is compared to a subset of randomly chosen items, this leads to dependence on the number of samples that is nearly order-optimal.
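The random-walk interpretation in this abstract can be sketched in a few lines. The walk moves from item i to item j with probability proportional to the fraction of their comparisons that j won, and an item's score is its stationary probability. Normalizing by the maximum degree of the comparison graph, the `(winner, loser)` input format, the function name, and the use of plain power iteration are assumptions of this sketch rather than details stated in the abstract.

```python
def rank_centrality(n_items, comparisons, n_iter=1000):
    """Estimate scores from (winner, loser) index pairs via a random walk.

    Assumes the comparison graph is connected, so that the stationary
    distribution is unique.
    """
    # wins[i][j] counts how many times item i beat item j.
    wins = [[0] * n_items for _ in range(n_items)]
    for winner, loser in comparisons:
        wins[winner][loser] += 1

    # Degree of an item = number of distinct opponents it was compared with.
    degrees = [
        sum(1 for j in range(n_items) if j != i and wins[i][j] + wins[j][i] > 0)
        for i in range(n_items)
    ]
    d_max = max(degrees)
    if d_max == 0:
        raise ValueError("no comparisons between distinct items")

    # Transition matrix: step from i to j in proportion to how often j beat i.
    P = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                P[i][j] = (wins[j][i] / total) / d_max
        P[i][i] = 1.0 - sum(P[i])  # self-loop keeps each row summing to one

    # Power iteration for the stationary distribution pi (pi = pi P).
    pi = [1.0 / n_items] * n_items
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n_items)) for j in range(n_items)]
    return pi
```

When the observed win fractions match a BTL model exactly, the stationary distribution equals the normalized BTL scores. For instance, with two items where item 1 wins two of three comparisons, `rank_centrality(2, [(1, 0), (1, 0), (0, 1)])` gives scores close to 1/3 and 2/3.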

artificial intelligence, machine learning, rank centrality

1209.1688

Country:

Asia (1.00)
Europe > United Kingdom > England (0.45)
North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games > Chess (0.54)
Leisure & Entertainment > Sports > Cricket (0.46)

Technology:

Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Zhao, Shiwen, Gao, Chuan, Mukherjee, Sayan, Engelhardt, Barbara E

Bayesian group latent factor analysis with structured sparsity

arXiv.org Machine Learning, Nov-11-2015

Latent factor models are the canonical statistical tool for exploratory analyses of low-dimensional linear structure for an observation matrix with p features across n samples. We develop a structured Bayesian group factor analysis model that extends the factor model to multiple coupled observation matrices; in the case of two observations, this reduces to a Bayesian model of canonical correlation analysis. The main contribution of this work is to carefully define a structured Bayesian prior that encourages both element-wise and column-wise shrinkage and leads to desirable behavior on high-dimensional data. In particular, our model puts a structured prior on the joint factor loading matrix, regularizing at three levels, which enables element-wise sparsity and unsupervised recovery of latent factors corresponding to structured variance across arbitrary subsets of the observations. In addition, our structured prior allows for both dense and sparse latent factors so that covariation among either all features or only a subset of features can both be recovered. We use fast parameter-expanded expectation-maximization for parameter estimation in this model. We validate our method on both simulated data with substantial structure and real data, comparing against a number of state-of-the-art approaches. These results illustrate useful properties of our model, including i) recovering sparse signal in the presence of dense effects; ii) the ability to scale naturally to large numbers of observations; iii) flexible observation- and factor-specific regularization to recover factors with a wide variety of sparsity levels and percentage of variance explained; and iv) tractable inference that scales to modern genomic and document data sizes.

loading matrix, matrix, sparsity

1411.2698

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)

Genre: Research Report > Promising Solution (0.65)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Halpern, Yoni, Horng, Steven, Sontag, David

Anchored Discrete Factor Analysis

arXiv.org Machine Learning, Nov-10-2015

We present a semi-supervised learning algorithm for learning discrete factor analysis models with arbitrary structure on the latent variables. Our algorithm assumes that every latent variable has an "anchor", an observed variable with only that latent variable as its parent. Given such anchors, we show that it is possible to consistently recover moments of the latent variables and use these moments to learn complete models. We also introduce a new technique for improving the robustness of method-of-moment algorithms by optimizing over the marginal polytope or its relaxations. We evaluate our algorithm using two real-world tasks, tag prediction on questions from the Stack Overflow website and medical diagnosis in an emergency department.

artificial intelligence, bayesian inference, machine learning

1511.03299

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Internal Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Inference in Cumulative Distribution Fields

Silva, Ricardo

One approach for constructing copula functions is by multiplication. Given that products of cumulative distribution functions (CDFs) are also CDFs, an adjustment to this multiplication will result in a copula model, as discussed by Liebscher (J Mult Analysis, 2008). Parameterizing models via products of CDFs has some advantages, both from the copula perspective (e.g., it is well-defined for any dimensionality) and from general multivariate analysis (e.g., it provides models where small dimensional marginal distributions can be easily read off from the parameters). Independently, Huang and Frey (J Mach Learn Res, 2011) showed the connection between certain sparse graphical models and products of CDFs, as well as message-passing (dynamic programming) schemes for computing the likelihood function of such models. Such schemes allow models to be estimated with likelihood-based methods. We discuss and demonstrate MCMC approaches for estimating such models in a Bayesian context, their application in copula modeling, and how message-passing can be strongly simplified. Importantly, our view of message-passing opens up possibilities for scaling up such methods, given that even dynamic programming is not a scalable solution for calculating likelihood functions in many models.
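The two properties this abstract relies on can be written out briefly. The notation ($F_k$, $S_k$, $A$, $m_i$) is introduced here for illustration and is not taken from the paper. A product of CDFs over subsets of the variables is again a CDF, and a marginal is obtained by sending the remaining arguments to infinity, which leaves a product of the factors' own marginals:

```latex
\begin{align*}
F(x_1,\dots,x_d) &= \prod_{k=1}^{K} F_k\bigl(x_{S_k}\bigr),
  \qquad S_k \subseteq \{1,\dots,d\},\ \text{each } F_k \text{ a CDF}, \\
F_A(x_A) &= \lim_{x_j \to \infty,\ j \notin A} F(x_1,\dots,x_d)
  = \prod_{k=1}^{K} F_k\bigl(x_{S_k \cap A}, \infty, \dots, \infty\bigr).
\end{align*}
```

As a simple case, if every factor has uniform marginals on $[0,1]$ and variable $i$ appears in $m_i$ factors, the marginal of $x_i$ under the product is $u^{m_i}$ rather than $u$. The product is therefore not itself a copula when $m_i > 1$, which is why an adjustment to the multiplication is needed.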

artificial intelligence, copula, machine learning

1511.02796

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

Guedj, Benjamin, Robbiano, Sylvain

PAC-Bayesian High Dimensional Bipartite Ranking

This paper is devoted to the bipartite ranking problem, a classical statistical learning task, in a high dimensional setting. We propose a scoring and ranking strategy based on the PAC-Bayesian approach. We consider nonlinear additive scoring functions, and we derive non-asymptotic risk bounds under a sparsity assumption. In particular, oracle inequalities in probability holding under a margin condition assess the performance of our procedure, and prove its minimax optimality. An MCMC-flavored algorithm is proposed to implement our method, along with its behavior on synthetic and real-life datasets.

artificial intelligence, inequality, machine learning

1511.02729

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Decomposition Bounds for Marginal MAP

Ping, Wei, Liu, Qiang, Ihler, Alexander

Marginal MAP inference involves making MAP predictions in systems defined with latent variables or missing information. It is significantly more difficult than pure marginalization and MAP tasks, for which a large class of efficient and convergent variational algorithms, such as dual decomposition, exists. In this work, we generalize dual decomposition to a generic power sum inference task, which includes marginal MAP, along with pure marginalization and MAP, as special cases. Our method is based on a block coordinate descent algorithm on a new convex decomposition bound that is guaranteed to converge monotonically and can be parallelized efficiently. We demonstrate our approach on marginal MAP queries defined on real-world problems from the UAI approximate inference challenge, showing that our framework is faster and more reliable than previous methods.
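To see how one operator can cover marginalization, MAP, and marginal MAP, the sketch below assumes the usual definition of the power sum with temperature tau, namely (sum_x f(x)^(1/tau))^tau, which is not quoted in the abstract. With tau = 1 it is an ordinary sum, and as tau goes to 0 it tends to the maximum. The function names and the brute-force table example are hypothetical; this illustrates the operator only, not the paper's decomposition bound.

```python
import math


def log_power_sum(log_values, tau):
    """Return log of (sum_x f(x)^(1/tau))^tau, given the values log f(x).

    tau = 1 gives log-sum-exp (marginalization); tau = 0 is treated as the
    limiting case and gives the maximum (MAP).
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    m = max(log_values)
    if tau == 0:
        return m
    # Subtracting the maximum first keeps the exponentials from overflowing.
    return m + tau * math.log(sum(math.exp((v - m) / tau) for v in log_values))


def log_marginal_map(log_f):
    """log of max_b sum_a f[a][b] for a table indexed as log_f[a][b]."""
    n_b = len(log_f[0])
    # Sum out variable a (tau = 1) separately for each value of b ...
    summed = [log_power_sum([row[b] for row in log_f], 1.0) for b in range(n_b)]
    # ... then maximize over b (tau = 0).
    return log_power_sum(summed, 0.0)
```

For the table `f = [[1, 4], [3, 1]]`, summing over `a` gives 4 and 5 for the two values of `b`, so the marginal MAP value is 5, whereas pure MAP gives 4 and pure marginalization gives 9. The three tasks differ only in which temperature is attached to each variable.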

artificial intelligence, machine learning, wmb 0

1511.02619

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions

Bach, Francis

We show that kernel-based quadrature rules for computing integrals can be seen as a special case of random feature expansions for positive definite kernels, for a particular decomposition that always exists for such kernels. We provide a theoretical analysis of the number of required samples for a given approximation error, leading to both upper and lower bounds that are based solely on the eigenvalues of the associated integral operator and match up to logarithmic terms. In particular, we show that the upper bound may be obtained from independent and identically distributed samples from a specific non-uniform distribution, while the lower bound is valid for any set of points. While our results are fairly general, applying them to kernel-based quadrature recovers known upper and lower bounds for the special cases of Sobolev spaces. Moreover, our results extend to the more general problem of full function approximations (beyond simply computing an integral), with results in $L_2$- and $L_\infty$-norm that match known results for special cases. Applying our results to random features, we show an improvement of the number of random features needed to preserve the generalization guarantees for learning with Lipschitz-continuous losses.
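As a generic illustration of what a random feature expansion is, the sketch below uses the standard random Fourier feature construction for a one-dimensional Gaussian kernel. It is not the particular decomposition or the non-uniform sampling distribution analyzed in the paper, and all function names are hypothetical. It relies on the identity exp(-(x - y)^2 / 2) = E[cos(w (x - y))] for w drawn from a standard normal distribution.

```python
import math
import random


def gaussian_kernel(x, y):
    """Exact kernel k(x, y) = exp(-(x - y)^2 / 2) for scalar inputs."""
    return math.exp(-0.5 * (x - y) ** 2)


def sample_frequencies(m, rng):
    """Draw m frequencies from N(0, 1), the spectral density of this kernel."""
    return [rng.gauss(0.0, 1.0) for _ in range(m)]


def random_features(x, freqs):
    """Map x to 2m features whose inner products approximate the kernel."""
    scale = 1.0 / math.sqrt(len(freqs))
    cos_part = [scale * math.cos(w * x) for w in freqs]
    sin_part = [scale * math.sin(w * x) for w in freqs]
    return cos_part + sin_part


def approx_kernel(x, y, freqs):
    """Inner product of feature maps; equals the average of cos(w * (x - y))."""
    fx = random_features(x, freqs)
    fy = random_features(y, freqs)
    return sum(a * b for a, b in zip(fx, fy))
```

With a few thousand frequencies, `approx_kernel(0.3, 1.1, freqs)` lands close to `gaussian_kernel(0.3, 1.1)`, and the error shrinks roughly like one over the square root of the number of features. The question of how many features are needed for a given error is the kind of sample-complexity question the abstract addresses.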

artificial intelligence, kernel, machine learning

1502.068

Country:

North America (0.45)
Europe (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)