Genre
Fast and Robust Recursive Algorithms for Separable Nonnegative Matrix Factorization
Gillis, Nicolas, Vavasis, Stephen A.
A hyperspectral image consists of a set of images taken at different wavelengths. It is acquired by measuring the spectral signature of each pixel present in the scene, that is, by measuring the reflectance (the fraction of the incident electromagnetic power that is reflected by a surface at a given wavelength) of each pixel at different wavelengths. One of the most important tasks in hyperspectral imaging is called unmixing. It requires the identification of the constitutive materials present in the image and estimation of their abundances in each pixel. The most widely used model is the linear mixing model: the spectral signature of each pixel results from the additive linear combination of the spectral signatures of the constitutive materials, called endmembers, where the weights of the linear combination correspond to the abundances of the different endmembers in that pixel.
Laplace approximation for logistic Gaussian process density estimation and regression
Riihimรคki, Jaakko, Vehtari, Aki
Logistic Gaussian process (LGP) priors provide a flexible alternative for modelling unknown densities. The smoothness properties of the density estimates can be controlled through the prior covariance structure of the LGP, but the challenge is the analytically intractable inference. In this paper, we present approximate Bayesian inference for LGP density estimation in a grid using Laplace's method to integrate over the non-Gaussian posterior distribution of latent function values and to determine the covariance function parameters with type-II maximum a posteriori (MAP) estimation. We demonstrate that Laplace's method with MAP is sufficiently fast for practical interactive visualisation of 1D and 2D densities. Our experiments with simulated and real 1D data sets show that the estimation accuracy is close to a Markov chain Monte Carlo approximation and state-of-the-art hierarchical infinite Gaussian mixture models. We also construct a reduced-rank approximation to speed up the computations for dense 2D grids, and demonstrate density regression with the proposed Laplace approach.
Parallel coordinate descent for the Adaboost problem
We design a randomised parallel version of Adaboost based on previous studies on parallel coordinate descent. The algorithm uses the fact that the logarithm of the exponential loss is a function with coordinate-wise Lipschitz continuous gradient, in order to define the step lengths. We provide the proof of convergence for this randomised Adaboost algorithm and a theoretical parallelisation speedup factor. We finally provide numerical examples on learning problems of various sizes that show that the algorithm is competitive with concurrent approaches, especially for large scale problems.
Bayesian Optimization With Censored Response Data
Hutter, Frank, Hoos, Holger, Leyton-Brown, Kevin
Bayesian optimization (BO) aims to minimize a given blackbox function using a model that is updated whenever new evidence about the function becomes available. Here, we address the problem of BO under partially right-censored response data, where in some evaluations we only obtain a lower bound on the function value. The ability to handle such response data allows us to adaptively censor costly function evaluations in minimization problems where the cost of a function evaluation corresponds to the function value. One important application giving rise to such censored data is the runtime-minimizing variant of the algorithm configuration problem: finding settings of a given parametric algorithm that minimize the runtime required for solving problem instances from a given distribution. We demonstrate that terminating slow algorithm runs prematurely and handling the resulting right-censored observations can substantially improve the state of the art in model-based algorithm configuration.
Learning Hidden Structures with Relational Models by Adequately Involving Rich Information in A Network
Fan, Xuhui, Da Xu, Richard Yi, Cao, Longbing, Song, Yin
Effectively modelling hidden structures in a network is very practical but theoretically challenging. Existing relational models only involve very limited information, namely the binary directional link data, embedded in a network to learn hidden networking structures. There is other rich and meaningful information (e.g., various attributes of entities and more granular information than binary elements such as "like" or "dislike") missed, which play a critical role in forming and understanding relations in a network. In this work, we propose an informative relational model (InfRM) framework to adequately involve rich information and its granularity in a network, including metadata information about each entity and various forms of link data. Firstly, an effective metadata information incorporation method is employed on the prior information from relational models MMSB and LFRM. This is to encourage the entities with similar metadata information to have similar hidden structures. Secondly, we propose various solutions to cater for alternative forms of link data. Substantial efforts have been made towards modelling appropriateness and efficiency, for example, using conjugate priors. We evaluate our framework and its inference algorithms in different datasets, which shows the generality and effectiveness of our models in capturing implicit structures in networks.
Copula Mixed-Membership Stochastic Blockmodel for Intra-Subgroup Correlations
Fan, Xuhui, Cao, Longbing, Da Xu, Richard Yi
The \emph{Mixed-Membership Stochastic Blockmodel (MMSB)} is a popular framework for modeling social network relationships. It can fully exploit each individual node's participation (or membership) in a social structure. Despite its powerful representations, this model makes an assumption that the distributions of relational membership indicators between two nodes are independent. Under many social network settings, however, it is possible that certain known subgroups of people may have high or low correlations in terms of their membership categories towards each other, and such prior information should be incorporated into the model. To this end, we introduce a \emph{Copula Mixed-Membership Stochastic Blockmodel (cMMSB)} where an individual Copula function is employed to jointly model the membership pairs of those nodes within the subgroup of interest. The model enables the use of various Copula functions to suit the scenario, while maintaining the membership's marginal distribution, as needed, for modeling membership indicators with other nodes outside of the subgroup of interest. We describe the proposed model and its inference algorithm in detail for both the finite and infinite cases. In the experiment section, we compare our algorithms with other popular models in terms of link prediction, using both synthetic and real world data.
Cross-lingual Pseudo-Projected Expectation Regularization for Weakly Supervised Learning
Wang, Mengqiu, Manning, Christopher D.
We consider a multilingual weakly supervised learning scenario where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide the learning in other languages. Past approaches project labels across bitext and use them as features or gold labels for training. We propose a new method that projects model expectations rather than labels, which facilities transfer of model uncertainty across language boundaries. We encode expectations as constraints and train a discriminative CRF model using Generalized Expectation Criteria (Mann and McCallum, 2010). Evaluated on standard Chinese-English and German-English NER datasets, our method demonstrates F1 scores of 64% and 60% when no labeled data is used. Attaining the same accuracy with supervised CRFs requires 12k and 1.5k labeled sentences. Furthermore, when combined with labeled examples, our method yields significant improvements over state-of-the-art supervised methods, achieving best reported numbers to date on Chinese OntoNotes and German CoNLL-03 datasets.
High dimensional Sparse Gaussian Graphical Mixture Model
This paper considers the problem of networks reconstruction from heterogeneous data using a Gaussian Graphical Mixture Model (GGMM). It is well known that parameter estimation in this context is challenging due to large numbers of variables coupled with the degeneracy of the likelihood. We propose as a solution a penalized maximum likelihood technique by imposing an $l_{1}$ penalty on the precision matrix. Our approach shrinks the parameters thereby resulting in better identifiability and variable selection. We use the Expectation Maximization (EM) algorithm which involves the graphical LASSO to estimate the mixing coefficients and the precision matrices. We show that under certain regularity conditions the Penalized Maximum Likelihood (PML) estimates are consistent. We demonstrate the performance of the PML estimator through simulations and we show the utility of our method for high dimensional data analysis in a genomic application.
Contraction Principle based Robust Iterative Algorithms for Machine Learning
Mitra, Rangeet, Mishra, Amit Kumar
Iterative algorithms are ubiquitous in the field of data mining. Widely known examples of such algorithms are the least mean square algorithm, backpropagation algorithm of neural networks. Our contribution in this paper is an improvement upon this iterative algorithms in terms of their respective performance metrics and robustness. This improvement is achieved by a new scaling factor which is multiplied to the error term. Our analysis shows that in essence, we are minimizing the corresponding LASSO cost function, which is the reason of its increased robustness. We also give closed form expressions for the number of iterations for convergence and the MSE floor of the original cost function for a minimum targeted value of the L1 norm. As a concluding theme based on the stochastic subgradient algorithm, we give a comparison between the well known Dantzig selector and our algorithm based on contraction principle. By these simulations we attempt to show the optimality of our approach for any widely used parent iterative optimization problem.
Narrowing the Gap: Random Forests In Theory and In Practice
Denil, Misha, Matheson, David, de Freitas, Nando
Despite widespread interest and practical use, the theoretical properties of random forests are still not well understood. In this paper we contribute to this understanding in two ways. We present a new theoretically tractable variant of random regression forests and prove that our algorithm is consistent. We also provide an empirical evaluation, comparing our algorithm and other theoretically tractable random forest models to the random forest algorithm used in practice. Our experiments provide insight into the relative importance of different simplifications that theoreticians have made to obtain tractable models for analysis.