AITopics

1711.0887

Country:

North America > United States (0.93)
Asia > Middle East (0.93)

Genre: Research Report (0.84)

Industry:

Transportation (0.68)
Leisure & Entertainment (0.68)
Law (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Spantini, Alessio, Bigoni, Daniele, Marzouk, Youssef

Inference via low-dimensional couplings

arXiv.org Machine Learning, Nov-24-2017

We investigate the low-dimensional structure of deterministic transformations between random variables, i.e., transport maps between probability measures. In the context of statistics and machine learning, these transformations can be used to couple a tractable "reference" measure (e.g., a standard Gaussian) with a target measure of interest. Direct simulation from the desired measure can then be achieved by pushing forward reference samples through the map. Yet characterizing such a map (e.g., representing and evaluating it) grows challenging in high dimensions. The central contribution of this paper is to establish a link between the Markov properties of the target measure and the existence of low-dimensional couplings, induced by transport maps that are sparse and/or decomposable. Our analysis not only facilitates the construction of transformations in high-dimensional settings, but also suggests new inference methodologies for continuous non-Gaussian graphical models. For instance, in the context of nonlinear state-space models, we describe new variational algorithms for filtering, smoothing, and sequential parameter inference. These algorithms can be understood as the natural generalization, to the non-Gaussian case, of the square-root Rauch-Tung-Striebel Gaussian smoother.
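The idea of simulating from a target by pushing reference samples through a map can be illustrated with a toy example. The sketch below is not the paper's construction: the map `triangular_map`, its parameters `a` and `b`, and the resulting "banana-shaped" target are all hypothetical choices made for illustration. It only shows the mechanics of a lower-triangular map, where the first output depends on `z1` alone and the second on both inputs.

```python
import math
import random


def triangular_map(z1, z2, a=1.0, b=1.0):
    """Hypothetical lower-triangular map T(z1, z2) = (x1, x2).

    x1 depends only on z1; x2 depends on z1 and z2. The missing
    dependence of x1 on z2 is what makes the map sparse.
    """
    x1 = a * z1
    x2 = z2 + b * (x1 ** 2 - a ** 2)
    return x1, x2


def sample_target(n, seed=0):
    """Push n standard Gaussian reference samples through the map."""
    rng = random.Random(seed)
    return [
        triangular_map(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        for _ in range(n)
    ]


samples = sample_target(20000)
# With these parameters E[x2] = 0, so the sample mean should be small.
mean_x2 = sum(x2 for _, x2 in samples) / len(samples)
```

Each target sample costs one map evaluation, which is why the representation and evaluation cost of the map matters in high dimensions.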

artificial intelligence, machine learning, transport map, (15 more...)

1703.06131

Country: North America > United States > Massachusetts > Middlesex County (0.28)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Machine Learning, Nov-23-2017

Diversity-Promoting Bayesian Learning of Latent Variable Models

Xie, Pengtao, Zhu, Jun, Xing, Eric P.

To address three important issues involved in latent variable models (LVMs), including capturing infrequent patterns, achieving small-sized but expressive models and alleviating overfitting, several studies have been devoted to "diversifying" LVMs, which aim at encouraging the components in LVMs to be diverse. Most existing studies fall into a frequentist-style regularization framework, where the components are learned via point estimation. In this paper, we investigate how to "diversify" LVMs in the paradigm of Bayesian learning. We propose two approaches that have complementary advantages. One is to define a diversity-promoting mutual angular prior, which assigns larger density to components with larger mutual angles, and to use this prior to affect the posterior via Bayes' rule. We develop two efficient approximate posterior inference algorithms based on variational inference and MCMC sampling. The other approach is to impose diversity-promoting regularization directly over the post-data distribution of components. We also extend our approach to "diversify" Bayesian nonparametric models where the number of components is infinite. A sampling algorithm based on slice sampling and Hamiltonian Monte Carlo is developed. We apply these methods to "diversify" the Bayesian mixture of experts model and the infinite latent feature model. Experiments on various datasets demonstrate the effectiveness and efficiency of our methods.
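As a rough illustration of the quantity such a prior rewards, the sketch below computes the average pairwise angle between component vectors. The abstract does not define the angle or the prior density, so the function names and the use of the absolute inner product (which treats a vector and its negation as the same direction) are assumptions made for this example.

```python
import math
from itertools import combinations


def mutual_angle(a, b):
    """Angle in [0, pi/2] between two nonzero vectors (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to 1.0 to guard against rounding slightly above 1.
    cosine = min(1.0, abs(dot) / (norm_a * norm_b))
    return math.acos(cosine)


def mean_mutual_angle(components):
    """Average angle over all unordered pairs of components."""
    pairs = list(combinations(components, 2))
    return sum(mutual_angle(a, b) for a, b in pairs) / len(pairs)


orthogonal = mean_mutual_angle([[1.0, 0.0], [0.0, 1.0]])  # most diverse
parallel = mean_mutual_angle([[1.0, 0.0], [2.0, 0.0]])    # least diverse
```

A diversity-promoting prior in this spirit would assign higher density to the first set of components than to the second.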

diversity-promoting bayesian learning, inference, vector, (13 more...)

1711.0877

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Middle East > Jordan (0.05)
Asia > Middle East > Iraq > Baghdad Governorate > Baghdad (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

@machinelearnbot, Nov-22-2017, 05:42:44 GMT

Bayesian Nonparametric Models

Bayesian nonparametrics is a class of models with a potentially infinite number of parameters. The high flexibility and expressive power of this approach enable better data modelling than parametric methods.

bayesian inference, bayesian nonparametric model, machine learning, (1 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Trajdos, Pawel, Kurzynski, Marek

Dynamic classifier chains for multi-label learning

In this paper, we deal with the task of building a dynamic ensemble of chain classifiers for multi-label classification. To do so, we propose two classifier chain algorithms that are able to change the label order of the chain without rebuilding the entire model. Such models allow anticipating the instance-specific chain order without a significant increase in computational burden. The proposed chain models are built using the Naive Bayes classifier and the nearest neighbour approach as base single-label classifiers. To take advantage of the proposed algorithms, we developed a simple heuristic that allows the system to find a relatively good label order. The heuristic sorts labels according to the label-specific classification quality gained during the validation phase. The heuristic tries to minimise the phenomenon of error propagation in the chain. The experimental results showed that the proposed model based on the Naive Bayes classifier and the above-mentioned heuristic is an efficient tool for building dynamic chain classifiers.
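The ordering heuristic can be sketched as follows. The abstract does not say which quality measure is used or in which direction labels are sorted, so this example assumes per-label accuracy on a validation set and places the most reliable labels first, on the reasoning that early mistakes propagate down the chain. Both function names are hypothetical.

```python
def per_label_accuracy(y_true, y_pred):
    """Fraction of validation instances predicted correctly, per label."""
    n_instances = len(y_true)
    n_labels = len(y_true[0])
    return [
        sum(1 for t, p in zip(y_true, y_pred) if t[j] == p[j]) / n_instances
        for j in range(n_labels)
    ]


def order_labels_by_quality(quality):
    """Label indices from highest to lowest quality (ties keep index order)."""
    return sorted(range(len(quality)), key=lambda j: -quality[j])


# Toy validation data: 4 instances, 3 binary labels.
y_true = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 1, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

quality = per_label_accuracy(y_true, y_pred)
chain_order = order_labels_by_quality(quality)
```

Because the ordering is just a sort over precomputed scores, it can be recomputed per instance without retraining the base classifiers, provided the chain itself supports reordering.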

artificial intelligence, classifier, machine learning, (18 more...)

doi: 10.1007/978-3-030-33676-9_40

1710.07491

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

An Efficient ADMM Algorithm for Structural Break Detection in Multivariate Time Series

Tank, Alex, Fox, Emily B., Shojaie, Ali

We present an efficient alternating direction method of multipliers (ADMM) algorithm for segmenting a multivariate non-stationary time series with structural breaks into stationary regions. We draw from recent work where the series is assumed to follow a vector autoregressive model within segments and a convex estimation procedure may be formulated using group fused lasso penalties. Our ADMM approach first splits the convex problem into a global quadratic program and a simple group lasso proximal update. We show that the global problem may be parallelized over rows of the time dependent transition matrices and furthermore that each subproblem may be rewritten in a form identical to the log-likelihood of a Gaussian state space model. Consequently, we develop a Kalman smoothing algorithm to solve the global update in time linear in the length of the series.
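The group lasso proximal update mentioned above has a well-known closed form, often called group soft-thresholding. The sketch below shows that standard operator for a single group; the function name is hypothetical, and how the threshold `lam` relates to the penalty weight and ADMM step size in the paper's algorithm is not stated in the abstract.

```python
import math


def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_2 for one group of coefficients.

    Shrinks the whole group toward zero and sets it exactly to zero
    when its Euclidean norm is at most lam.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= lam:
        return [0.0 for _ in v]
    scale = 1.0 - lam / norm
    return [scale * x for x in v]


shrunk = group_soft_threshold([3.0, 4.0], 1.0)   # norm 5, scaled by 0.8
zeroed = group_soft_threshold([0.3, 0.4], 1.0)   # norm 0.5, set to zero
```

In a fused-lasso setting the groups would be differences between consecutive transition matrices, so a zeroed group corresponds to no structural break at that time point.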

artificial intelligence, bayesian inference, machine learning, (13 more...)

1711.08392

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Variational Bayesian Inference For A Scale Mixture Of Normal Distributions Handling Missing Data

Revillon, G., Djafari, A., Enderli, C.

In this paper, a scale mixture of Normal distributions model is developed for classification and clustering of data having outliers and missing values. The classification method, based on a mixture model, focuses on the introduction of latent variables that make it possible to handle the sensitivity of the model to outliers and to allow a less restrictive modelling of missing data. Inference is performed through a Variational Bayesian Approximation, and a Bayesian treatment is adopted for model learning, supervised classification and clustering.

artificial intelligence, machine learning, posterior distribution, (12 more...)

1711.08374

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Rawat, Ambrish, Wistuba, Martin, Nicolae, Maria-Irina

Adversarial Phenomenon in the Eyes of Bayesian Deep Learning

Deep Learning models are vulnerable to adversarial examples, i.e., images obtained via deliberate imperceptible perturbations, such that the model misclassifies them with high confidence. However, class confidence by itself is an incomplete picture of uncertainty. We therefore use principled Bayesian methods to capture model uncertainty in prediction for observing adversarial misclassification. We provide an extensive study with different Bayesian neural networks attacked in both white-box and black-box setups. The behaviour of the networks for noise, attacks and clean test data is compared. We observe that Bayesian neural networks are uncertain in their predictions for adversarial perturbations, a behaviour similar to the one observed for random Gaussian perturbations. Thus, we conclude that Bayesian neural networks can be considered for detecting adversarial examples.
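One common way to quantify predictive uncertainty in a Bayesian neural network is the entropy of the class probabilities averaged over several sampled networks. The abstract does not say which uncertainty measure the authors use, so the sketch below is only an assumed example; `predictive_entropy` and its input format (one probability vector per posterior sample) are hypothetical.

```python
import math


def predictive_entropy(mc_probs):
    """Entropy (in nats) of the mean class distribution over posterior samples.

    mc_probs: list of probability vectors, one per sampled network,
    all for the same input.
    """
    n_samples = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean = [
        sum(p[c] for p in mc_probs) / n_samples
        for c in range(n_classes)
    ]
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in mean if p > 0.0)


# Sampled networks agree: low uncertainty.
agree = predictive_entropy([[1.0, 0.0], [1.0, 0.0]])
# Sampled networks disagree: high uncertainty, even though each
# individual network is fully confident.
disagree = predictive_entropy([[1.0, 0.0], [0.0, 1.0]])
```

The second case shows why a single network's class confidence can be misleading: each sample is confident, yet the ensemble as a whole is maximally uncertain between the two classes.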

artificial intelligence, bayesian inference, machine learning, (16 more...)

1711.08244

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry:

Information Technology (0.48)
Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sykulski, Adam M., Olhede, Sofia C., Lilly, Jonathan M., Guillaumin, Arthur P., Early, Jeffrey J.

The De-Biased Whittle Likelihood

The Whittle likelihood is a widely used and computationally efficient pseudo-likelihood. However, it is known to produce biased parameter estimates for large classes of models. We propose a method for de-biasing Whittle estimates for second-order stationary stochastic processes. The de-biased Whittle likelihood can be computed in the same $\mathcal{O}(n\log n)$ operations as the standard approach. We demonstrate the superior performance of the method in simulation studies and in application to a large-scale oceanographic dataset, where in both cases the de-biased approach reduces bias by up to two orders of magnitude, achieving estimates that are close to exact maximum likelihood, at a fraction of the computational cost. We prove that the method yields estimates that are consistent at an optimal convergence rate of $n^{-1/2}$, under weaker assumptions than standard theory, where we do not require that the power spectral density is continuous in frequency. We describe how the method can be easily combined with standard methods of bias reduction, such as tapering and differencing, to further reduce bias in parameter estimates.

artificial intelligence, likelihood, machine learning, (18 more...)

1605.06718

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Blanchard, Gilles, Deshmukh, Aniket Anand, Dogan, Urun, Lee, Gyemin, Scott, Clayton

Domain Generalization by Marginal Transfer Learning

arXiv.org Machine Learning, Nov-21-2017

Domain generalization is the problem of assigning class labels to an unlabeled test data set, given several labeled training data sets drawn from similar distributions. This problem arises in several applications where data distributions fluctuate because of biological, technical, or other sources of variation. We develop a distribution-free, kernel-based approach that predicts a classifier from the marginal distribution of features, by leveraging the trends present in related classification tasks. This approach involves identifying an appropriate reproducing kernel Hilbert space and optimizing a regularized empirical risk over the space. We present generalization error analysis, describe universal kernels, and establish universal consistency of the proposed methodology. Experimental results on synthetic data and three real data applications demonstrate the superiority of the method with respect to a pooling strategy.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1711.0791

Country: Asia (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)