Directed Networks
Faster graphical model identification of tandem mass spectra using peptide word lattices
Wang, Shengjie, Halloran, John T., Bilmes, Jeff A., Noble, William S.
Liquid chromatography coupled with tandem mass spectrometry, also known as shotgun proteomics, is a widely-used high-throughput technology for identifying proteins in complex biological samples. Analysis of the tens of thousands of fragmentation spectra produced by a typical shotgun proteomics experiment begins by assigning to each observed spectrum the peptide hypothesized to be responsible for generating the spectrum, typically done by searching each spectrum against a database of peptides. We have recently described a machine learning method---Dynamic Bayesian Network for Rapid Identification of Peptides (DRIP)---that not only achieves state-of-the-art spectrum identification performance on a variety of datasets but also provides a trainable model capable of returning valuable auxiliary information regarding specific peptide-spectrum matches. In this work, we present two significant improvements to DRIP. First, we describe how to use word lattices, which are widely used in natural language processing, to significantly speed up DRIP's computations. To our knowledge, all existing shotgun proteomics search engines compute independent scores between a given observed spectrum and each possible candidate peptide from the database. The key idea of the word lattice is to represent the set of candidate peptides in a single data structure, thereby allowing sharing of redundant computations among the different candidates. We demonstrate that using lattices in conjunction with DRIP leads to speedups on the order of tens across yeast and worm data sets. Second, we introduce a variant of DRIP that uses a discriminative training framework, performing maximum mutual entropy estimation rather than maximum likelihood estimation. This modification improves DRIP's statistical power, enabling us to increase the number of identified spectrum at a 1% false discovery rate on yeast and worm data sets.
Sparse Estimation using Bayesian Hierarchical Prior Modeling for Real and Complex Linear Models
Pedersen, Niels Lovmand, Manchรณn, Carles Navarro, Badiu, Mihai-Alin, Shutin, Dmitriy, Fleury, Bernard Henri
In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex-valued models, this paper proposes a GSM model - the Bessel K model - that induces concave penalty functions for the estimation of complex sparse signals. The properties of the Bessel K model are analyzed when it is applied to Type I and Type II estimation. This analysis reveals that, by tuning the parameters of the mixing pdf different penalty functions are invoked depending on the estimation type used, the value of the noise variance, and whether real or complex signals are estimated. Using the Bessel K model, we derive a sparse estimator based on a modification of the expectation-maximization algorithm formulated for Type II estimation. The estimator includes as a special instance the algorithms proposed by Tipping and Faul [1] and by Babacan et al. [2]. Numerical results show the superiority of the proposed estimator over these state-of-the-art estimators in terms of convergence speed, sparseness, reconstruction error, and robustness in low and medium signal-to-noise ratio regimes.
On the Challenges of Physical Implementations of RBMs
Dumoulin, Vincent, Goodfellow, Ian J., Courville, Aaron, Bengio, Yoshua
Restricted Boltzmann machines (RBMs) are powerful machine learning models, but learning and some kinds of inference in the model require sampling-based approximations, which, in classical digital computers, are implemented using expensive MCMC. Physical computation offers the opportunity to reduce the cost of sampling by building physical systems whose natural dynamics correspond to drawing samples from the desired RBM distribution. Such a system avoids the burn-in and mixing cost of a Markov chain. However, hardware implementations of this variety usually entail limitations such as low-precision and limited range of the parameters and restrictions on the size and topology of the RBM. We conduct software simulations to determine how harmful each of these restrictions is. Our simulations are based on the D-Wave Two computer, but the issues we investigate arise in most forms of physical computation. Our findings suggest that designers of new physical computing hardware and algorithms for physical computers should focus their efforts on overcoming the limitations imposed by the topology restrictions of currently existing physical computers.
Non-parametric Bayesian Learning with Deep Learning Structure and Its Applications in Wireless Networks
Statistical models have been applied to the classification and prediction problems in machine learning and data analysis [1]. Some statistical methods make the hypothesis of mathematical models that are controlled by certain parameters to fit the latent structure of observed data [2]. The observed data are assumed to be generated by complex structures which have hierarchical layers and hidden causes [3]. One key challenge faced by modeling the data structure in this way is thus the determination of the numbers of layers and hidden variables. However, it is sometimes impractical and challenging to choose any fixed number for the model structure when making the hypothesis.
Bayesian matrix completion: prior specification
Alquier, Pierre, Cottet, Vincent, Chopin, Nicolas, Rousseau, Judith
Low-rank matrix estimation from incomplete measurements recently received increased attention due to the emergence of several challenging applications, such as recommender systems; see in particular the famous Netflix challenge. While the behaviour of algorithms based on nuclear norm minimization is now well understood, an as yet unexplored avenue of research is the behaviour of Bayesian algorithms in this context. In this paper, we briefly review the priors used in the Bayesian literature for matrix completion. A standard approach is to assign an inverse gamma prior to the singular values of a certain singular value decomposition of the matrix of interest; this prior is conjugate. However, we show that two other types of priors (again for the singular values) may be conjugate for this model: a gamma prior, and a discrete prior. Conjugacy is very convenient, as it makes it possible to implement either Gibbs sampling or Variational Bayes. Interestingly enough, the maximum a posteriori for these different priors is related to the nuclear norm minimization problems. We also compare all these priors on simulated datasets, and on the classical MovieLens and Netflix datasets.
Low-Rank Modeling and Its Applications in Image Analysis
Zhou, Xiaowei, Yang, Can, Zhao, Hongyu, Yu, Weichuan
Low-rank modeling generally refers to a class of methods that solve problems by representing variables of interest as low-rank matrices. It has achieved great success in various fields including computer vision, data mining, signal processing and bioinformatics. Recently, much progress has been made in theories, algorithms and applications of low-rank modeling, such as exact low-rank matrix recovery via convex programming and matrix completion applied to collaborative filtering. These advances have brought more and more attentions to this topic. In this paper, we review the recent advance of low-rank modeling, the state-of-the-art algorithms, and related applications in image analysis. We first give an overview to the concept of low-rank modeling and challenging problems in this area. Then, we summarize the models and algorithms for low-rank matrix recovery and illustrate their advantages and limitations with numerical experiments. Next, we introduce a few applications of low-rank modeling in the context of image analysis. Finally, we conclude this paper with some discussions.
Classification of Autism Spectrum Disorder Using Supervised Learning of Brain Connectivity Measures Extracted from Synchrostates
Jamal, Wasifa, Das, Saptarshi, Oprescu, Ioana-Anastasia, Maharatna, Koushik, Apicella, Fabio, Sicca, Federico
Objective. The paper investigates the presence of autism using the functional brain connectivity measures derived from electro-encephalogram (EEG) of children during face perception tasks. Approach. Phase synchronized patterns from 128-channel EEG signals are obtained for typical children and children with autism spectrum disorder (ASD). The phase synchronized states or synchrostates temporally switch amongst themselves as an underlying process for the completion of a particular cognitive task. We used 12 subjects in each group (ASD and typical) for analyzing their EEG while processing fearful, happy and neutral faces. The minimal and maximally occurring synchrostates for each subject are chosen for extraction of brain connectivity features, which are used for classification between these two groups of subjects. Among different supervised learning techniques, we here explored the discriminant analysis and support vector machine both with polynomial kernels for the classification task. Main results. The leave one out cross-validation of the classification algorithm gives 94.7% accuracy as the best performance with corresponding sensitivity and specificity values as 85.7% and 100% respectively. Significance. The proposed method gives high classification accuracies and outperforms other contemporary research results. The effectiveness of the proposed method for classification of autistic and typical children suggests the possibility of using it on a larger population to validate it for clinical practice.
Variational Bayes for Merging Noisy Databases
Broderick, Tamara, Steorts, Rebecca C.
Bayesian entity resolution merges together multiple, noisy databases and returns the minimal collection of unique individuals represented, together with their true, latent record values. Bayesian methods allow flexible generative models that share power across databases as well as principled quantification of uncertainty for queries of the final, resolved database. However, existing Bayesian methods for entity resolution use Markov monte Carlo method (MCMC) approximations and are too slow to run on modern databases containing millions or billions of records. Instead, we propose applying variational approximations to allow scalable Bayesian inference in these models. We derive a coordinate-ascent approximation for mean-field variational Bayes, qualitatively compare our algorithm to existing methods, note unique challenges for inference that arise from the expected distribution of cluster sizes in entity resolution, and discuss directions for future work in this domain.
A Fusion Approach for Efficient Human Skin Detection
Tan, Wei Ren, Chan, Chee Seng, Yogarajah, Pratheepan, Condell, Joan
A reliable human skin detection method that is adaptable to different human skin colours and illu- mination conditions is essential for better human skin segmentation. Even though different human skin colour detection solutions have been successfully applied, they are prone to false skin detection and are not able to cope with the variety of human skin colours across different ethnic. Moreover, existing methods require high computational cost. In this paper, we propose a novel human skin de- tection approach that combines a smoothed 2D histogram and Gaussian model, for automatic human skin detection in colour image(s). In our approach an eye detector is used to refine the skin model for a specific person. The proposed approach reduces computational costs as no training is required; and it improves the accuracy of skin detection despite wide variation in ethnicity and illumination. To the best of our knowledge, this is the first method to employ fusion strategy for this purpose. Qualitative and quantitative results on three standard public datasets and a comparison with state-of-the-art methods have shown the effectiveness and robustness of the proposed approach.
PAC-Bayesian AUC classification and scoring
Ridgway, James, Alquier, Pierre, Chopin, Nicolas, Liang, Feng
We develop a scoring and classification procedure based on the PAC-Bayesian approach and the AUC (Area Under Curve) criterion. We focus initially on the class of linear score functions. We derive PAC-Bayesian non-asymptotic bounds for two types of prior for the score parameters: a Gaussian prior, and a spike-and-slab prior; the latter makes it possible to perform feature selection. One important advantage of our approach is that it is amenable to powerful Bayesian computational tools. We derive in particular a Sequential Monte Carlo algorithm, as an efficient method which may be used as a gold standard, and an Expectation-Propagation algorithm, as a much faster but approximate method. We also extend our method to a class of non-linear score functions, essentially leading to a nonparametric procedure, by considering a Gaussian process prior.