AITopics

1808.07226

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Yang, Yun, Pati, Debdeep, Bhattacharya, Anirban

$\alpha$-Variational Inference with Statistical Guarantees

arXiv.org Machine LearningFeb-7-2018

We propose a family of variational approximations to Bayesian posterior distributions, called $\alpha$-VB, with provable statistical guarantees. The standard variational approximation is a special case of $\alpha$-VB with $\alpha=1$. When $\alpha \in(0,1]$, a novel class of variational inequalities are developed for linking the Bayes risk under the variational approximation to the objective function in the variational optimization problem, implying that maximizing the evidence lower bound in variational inference has the effect of minimizing the Bayes risk within the variational density family. Operating in a frequentist setup, the variational inequalities imply that point estimates constructed from the $\alpha$-VB procedure converge at an optimal rate to the true parameter in a wide range of problems. We illustrate our general theory with a number of examples, including the mean-field variational approximation to (low)-high-dimensional Bayesian linear regression with spike and slab priors, mixture of Gaussian models, latent Dirichlet allocation, and (mixture of) Gaussian variational approximation in regular parametric models.

approximation, artificial intelligence, machine learning, (17 more...)

1710.03266

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Mescheder, Lars M., Lorenz, Dirk A.

An extended Perona-Malik model based on probabilistic models

arXiv.org Machine LearningDec-19-2016

The Perona-Malik model has been very successful at restoring images from noisy input. In this paper, we reinterpret the Perona-Malik model in the language of Gaussian scale mixtures and derive some extensions of the model. Specifically, we show that the expectation-maximization (EM) algorithm applied to Gaussian scale mixtures leads to the lagged-diffusivity algorithm for computing stationary points of the Perona-Malik diffusion equations. Moreover, we show how mean field approximations to these Gaussian scale mixtures lead to a modification of the lagged-diffusivity algorithm that better captures the uncertainties in the restoration. Since this modification can be hard to compute in practice we propose relaxations to the mean field objective to make the algorithm computationally feasible. Our numerical experiments show that this modified lagged-diffusivity algorithm often performs better at restoring textured areas and fuzzy edges than the unmodified algorithm. As a second application of the Gaussian scale mixture framework, we show how an efficient sampling procedure can be obtained for the probabilistic model, making the computation of the conditional mean and other expectations algorithmically feasible. Again, the resulting algorithm has a strong resemblance to the lagged-diffusivity algorithm. Finally, we show that a probabilistic version of the Mumford-Shah segementation model can be obtained in the same framework with a discrete edge-prior.

artificial intelligence, machine learning, perona-malik model, (19 more...)

1612.06176

Country: Europe > Germany (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Takahashi, Chako, Yasuda, Muneki

Mean-Field Inference in Gaussian Restricted Boltzmann Machine

arXiv.org Machine LearningMar-17-2016

A Gaussian restricted Boltzmann machine (GRBM) is a Boltzmann machine defined on a bipartite graph and is an extension of usual restricted Boltzmann machines. A GRBM consists of two different layers: a visible layer composed of continuous visible variables and a hidden layer composed of discrete hidden variables. In this paper, we derive two different inference algorithms for GRBMs based on the naive mean-field approximation (NMFA). One is an inference algorithm for whole variables in a GRBM, and the other is an inference algorithm for partial variables in a GBRBM. We compare the two methods analytically and numerically and show that the latter method is better.

artificial intelligence, deep learning, machine learning, (16 more...)

doi: 10.7566/JPSJ.85.034001

1512.00927

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Wang, Peng, Shen, Chunhua, Hengel, Anton van den

Efficient SDP Inference for Fully-connected CRFs Based on Low-rank Decomposition

arXiv.org Machine LearningApr-7-2015

Conditional Random Fields (CRF) have been widely used in a variety of computer vision tasks. Conventional CRFs typically define edges on neighboring image pixels, resulting in a sparse graph such that efficient inference can be performed. However, these CRFs fail to model long-range contextual relationships. Fully-connected CRFs have thus been proposed. While there are efficient approximate inference methods for such CRFs, usually they are sensitive to initialization and make strong assumptions. In this work, we develop an efficient, yet general algorithm for inference on fully-connected CRFs. The algorithm is based on a scalable SDP algorithm and the low- rank approximation of the similarity/kernel matrix. The core of the proposed algorithm is a tailored quasi-Newton method that takes advantage of the low-rank matrix approximation when solving the specialized SDP dual problem. Experiments demonstrate that our method can be applied on fully-connected CRFs that cannot be solved previously, such as pixel-level image co-segmentation.

approximation, artificial intelligence, machine learning, (18 more...)

doi: 10.1109/CVPR.2015.7298942

1504.01492

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Xing, Eric P., Jordan, Michael I., Russell, Stuart

A Generalized Mean Field Algorithm for Variational Inference in Exponential Families

arXiv.org Machine LearningOct-19-2012

The mean field methods, which entail approximating intractable probability distributions variationally with distributions from a tractable family, enjoy high efficiency, guaranteed convergence, and provide lower bounds on the true likelihood. But due to requirement for model-specific derivation of the optimization equations and unclear inference quality in various models, it is not widely used as a generic approximate inference algorithm. In this paper, we discuss a generalized mean field theory on variational approximation to a broad class of intractable distributions using a rich set of tractable distributions via constrained optimization over distribution spaces. We present a class of generalized mean field (GMF) algorithms for approximate inference in complex exponential family models, which entails limiting the optimization over the class of cluster-factorizable distributions. GMF is a generic method requiring no model-specific derivations. It factors a complex model into a set of disjoint variable clusters, and uses a set of canonical fix-point equations to iteratively update the cluster distributions, and converge to locally optimal cluster marginals that preserve the original dependency structure within each cluster, hence, fully decomposed the overall inference problem. We empirically analyzed the effect of different tractable family (clusters of different granularity) on inference quality, and compared GMF with BP on several canonical models. Possible extension to higher-order MF approximation is also discussed.

algorithm, artificial intelligence, machine learning, (19 more...)

1212.2512

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)

Neural Information Processing SystemsDec-31-2006

Affine Structure From Sound

Thrun, Sebastian

We consider the problem of localizing a set of microphones together with a set of external acoustic events (e.g., hand claps), emitted at unknown times and unknown locations. We propose a solution that approximates this problem under a far field approximation defined in the calculus of affine geometry, and that relies on singular value decomposition (SVD) to recover the affine structure of the problem. We then define low-dimensional optimization techniques for embedding the solution into Euclidean geometry, and further techniques for recovering the locations and emission times of the acoustic events. The approach is useful for the calibration of ad-hoc microphone arrays and sensor networks.

acoustic event, gradient descent, sensor, (12 more...)

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Communications > Networks > Sensor Networks (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)

Neural Information Processing SystemsDec-31-2006

Affine Structure From Sound

Thrun, Sebastian

acoustic event, gradient descent, sensor, (12 more...)

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Communications > Networks > Sensor Networks (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)

Neural Information Processing SystemsDec-31-2006

Affine Structure From Sound

Thrun, Sebastian

The approach is useful for the calibration of ad-hoc microphone arrays and sensor networks.

acoustic event, artificial intelligence, machine learning, (16 more...)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.51)
Information Technology > Communications > Networks > Sensor Networks (0.50)

Downs, Oliver B., MacKay, David J. C., Lee, Daniel D.

The Nonnegative Boltzmann Machine

Neural Information Processing SystemsDec-31-2000

The nonnegative Boltzmann machine (NNBM) is a recurrent neural network model that can describe multimodal nonnegative data. Application of maximum likelihood estimation to this model gives a learning rule that is analogous to the binary Boltzmann machine. We examine the utility of the mean field approximation for the NNBM, and describe how Monte Carlo sampling techniques can be used to learn its parameters. Reflective slice sampling is particularly well-suited for this distribution, and can efficiently be implemented to sample the distribution. We illustrate learning of the NNBM on a transiationally invariant distribution, as well as on a generative model for images of human faces. Introduction The multivariate Gaussian is the most elementary distribution used to model generic data. It represents the maximum entropy distribution under the constraint that the mean and covariance matrix of the distribution match that of the data. For the case of binary data, the maximum entropy distribution that matches the first and second order statistics of the data is given by the Boltzmann machine [1].

boltzmann machine, field approximation, nnbm distribution, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)