Bayesian Inference
A Bayesian Framework for Community Detection Integrating Content and Link
Yang, Tianbao, Jin, Rong, Chi, Yun, Zhu, Shenghuo
This paper addresses the problem of community detection in networked data that combines link and content analysis. Most existing work combines link and content information by a generative model. There are two major shortcomings with the existing approaches. First, they assume that the probability of creating a link between two nodes is determined only by the community memberships of the nodes; however other factors (e.g. popularity) could also affect the link pattern. Second, they use generative models to model the content of individual nodes, whereas these generative models are vulnerable to the content attributes that are irrelevant to communities. We propose a Bayesian framework for combining link and content information for community detection that explicitly addresses these shortcomings. A new link model is presented that introduces a random variable to capture the node popularity when deciding the link between two nodes; a discriminative model is used to determine the community membership of a node by its content. An approximate inference algorithm is presented for efficient Bayesian inference. Our empirical study shows that the proposed framework outperforms several state-of-theart approaches in combining link and content information for community detection.
Most Relevant Explanation: Properties, Algorithms, and Evaluations
Yuan, Changhe, Liu, Xiaolu, Lu, Tsai-Ching, Lim, Heejin
Most Relevant Explanation (MRE) is a method for finding multivariate explanations for given evidence in Bayesian networks [12]. This paper studies the theoretical properties of MRE and develops an algorithm for finding multiple top MRE solutions. Our study shows that MRE relies on an implicit soft relevance measure in automatically identifying the most relevant target variables and pruning less relevant variables from an explanation. The soft measure also enables MRE to capture the intuitive phenomenon of explaining away encoded in Bayesian networks. Furthermore, our study shows that the solution space of MRE has a special lattice structure which yields interesting dominance relations among the solutions. A K-MRE algorithm based on these dominance relations is developed for generating a set of top solutions that are more representative. Our empirical results show that MRE methods are promising approaches for explanation in Bayesian networks.
Bayesian clustering in decomposable graphs
This paper is concerned with the inference of the conditional independence graph G of a multivariate random vector Y of dimension n, a problem sometimes referred to as structure learning. We focus here on undirected decomposable graphs, whose popularity is mainly due to the tractable factorization they allow for the likelihood ([9, 20]); related work for directed graphical models can be found in [18]. Learning the conditional 1 independence graph G is an onerous task due to the large number of graphs on a set of n nodes, or variables. It is possible using optimization methods to find the graph which best fits the data according to some metric [23, 30, 13]; alternatively Bayesian model averaging may be used to accommodate for uncertainty in the estimated graph, or maximum a posteriori estimation may be used to select a given model from the posterior over graphs. Such an approach relies on a prior distribution ฯ(G) over the set of decomposable graphs of a given size; through Bayes theorem, this prior is updated based on the data to give an a posteriori estimate of the distribution over graphs.
Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches
Bioucas-Dias, Josรฉ M., Plaza, Antonio, Dobigeon, Nicolas, Parente, Mario, Du, Qian, Gader, Paul, Chanussot, Jocelyn
Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field view in hundreds or thousands of spectral channels with higher spectral resolution than multispectral cameras. Imaging spectrometers are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus, accurate estimation requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers. Unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models searching for robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models are first discussed. Signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are described. Mathematical problems and potential solutions are described. Algorithm characteristics are illustrated experimentally.
A Privacy-Aware Bayesian Approach for Combining Classifier and Cluster Ensembles
Acharya, Ayan, Hruschka, Eduardo R., Ghosh, Joydeep
This paper introduces a privacy-aware Bayesian approach that combines ensembles of classifiers and clusterers to perform semi-supervised and transductive learning. We consider scenarios where instances and their classification/clustering results are distributed across different data sites and have sharing restrictions. As a special case, the privacy aware computation of the model when instances of the target data are distributed across different data sites, is also discussed. Experimental results show that the proposed approach can provide good classification accuracies while adhering to the data/model sharing constraints.
The Discrete Infinite Logistic Normal Distribution
Paisley, John, Wang, Chong, Blei, David
We present the discrete infinite logistic normal distribution (DILN), a Bayesian nonparametric prior for mixed membership models. DILN is a generalization of the hierarchical Dirichlet process (HDP) that models correlation structure between the weights of the atoms at the group level. We derive a representation of DILN as a normalized collection of gamma-distributed random variables, and study its statistical properties. We consider applications to topic modeling and derive a variational inference algorithm for approximate posterior inference. We study the empirical performance of the DILN topic model on four corpora, comparing performance with the HDP and the correlated topic model (CTM). To deal with large-scale data sets, we also develop an online inference algorithm for DILN and compare with online HDP and online LDA on the Nature magazine, which contains approximately 350,000 articles.
EP-GIG Priors and Applications in Bayesian Sparse Learning
Zhang, Zhihua, Wang, Shusen, Liu, Dehua, Jordan, Michael I.
In this paper we propose a novel framework for the construction of sparsity-inducing priors. In particular, we define such priors as a mixture of exponential power distributions with a generalized inverse Gaussian density (EP-GIG). EP-GIG is a variant of generalized hyperbolic distributions, and the special cases include Gaussian scale mixtures and Laplace scale mixtures. Furthermore, Laplace scale mixtures can subserve a Bayesian framework for sparse learning with nonconvex penalization. The densities of EP-GIG can be explicitly expressed. Moreover, the corresponding posterior distribution also follows a generalized inverse Gaussian distribution. These properties lead us to EM algorithms for Bayesian sparse learning. We show that these algorithms bear an interesting resemblance to iteratively re-weighted $\ell_2$ or $\ell_1$ methods. In addition, we present two extensions for grouped variable selection and logistic regression.
Kernels for Vector-Valued Functions: a Review
Alvarez, Mauricio A., Rosasco, Lorenzo, Lawrence, Neil D.
Kernel methods are among the most popular techniques in machine learning. From a frequentist/discriminative perspective they play a central role in regularization theory as they provide a natural choice for the hypotheses space and the regularization functional through the notion of reproducing kernel Hilbert spaces. From a Bayesian/generative perspective they are the key in the context of Gaussian processes, where the kernel function is also known as the covariance function. Traditionally, kernel methods have been used in supervised learning problem with scalar outputs and indeed there has been a considerable amount of work devoted to designing and learning kernels. More recently there has been an increasing interest in methods that deal with multiple outputs, motivated partly by frameworks like multitask learning. In this paper, we review different methods to design or learn valid kernel functions for multiple outputs, paying particular attention to the connection between probabilistic and functional methods.
Probabilistic Latent Tensor Factorization Model for Link Pattern Prediction in Multi-relational Networks
Gao, Sheng, Denoyer, Ludovic, Gallinari, Patrick
This paper aims at the problem of link pattern prediction in collections of objects connected by multiple relation types, where each type may play a distinct role. While common link analysis models are limited to single-type link prediction, we attempt here to capture the correlations among different relation types and reveal the impact of various relation types on performance quality. For that, we define the overall relations between object pairs as a \textit{link pattern} which consists in interaction pattern and connection structure in the network, and then use tensor formalization to jointly model and predict the link patterns, which we refer to as \textit{Link Pattern Prediction} (LPP) problem. To address the issue, we propose a Probabilistic Latent Tensor Factorization (PLTF) model by introducing another latent factor for multiple relation types and furnish the Hierarchical Bayesian treatment of the proposed probabilistic model to avoid overfitting for solving the LPP problem. To learn the proposed model we develop an efficient Markov Chain Monte Carlo sampling method. Extensive experiments are conducted on several real world datasets and demonstrate significant improvements over several existing state-of-the-art methods.
Concept Modeling with Superwords
El-Arini, Khalid, Fox, Emily B., Guestrin, Carlos
In information retrieval, a fundamental goal is to transform a document into concepts that are representative of its content. The term "representative" is in itself challenging to define, and various tasks require different granularities of concepts. In this paper, we aim to model concepts that are sparse over the vocabulary, and that flexibly adapt their content based on other relevant semantic information such as textual structure or associated image features. We explore a Bayesian nonparametric model based on nested beta processes that allows for inferring an unknown number of strictly sparse concepts. The resulting model provides an inherently different representation of concepts than a standard LDA (or HDP) based topic model, and allows for direct incorporation of semantic features. We demonstrate the utility of this representation on multilingual blog data and the Congressional Record.