Computational Protein Design Using AND/OR Branch-and-Bound Search

arXiv.org Artificial Intelligence

The computation of the global minimum energy conformation (GMEC) is an important and challenging topic in structure-based computational protein design. In this paper, we propose a new protein design algorithm based on AND/OR branch-and-bound (AOBB) search, a variant of the traditional branch-and-bound search algorithm, to solve this combinatorial optimization problem. When combined with a powerful heuristic function, AOBB is able to fully exploit the graph structure of the underlying residue interaction network of a backbone template to significantly accelerate the design process. Tests on real protein data show that our new protein design algorithm is able to solve many problems that were previously unsolvable by the traditional exact search algorithms, and for the problems that can be solved with traditional provable algorithms, our new method can provide a speedup of several orders of magnitude while still guaranteeing to find the GMEC solution.
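To make the search problem concrete, here is a minimal sketch of a plain depth-first branch-and-bound over rotamer assignments. It is not the AND/OR variant described in the abstract and does not exploit the residue interaction graph. The energy tables (`self_energy`, `pair_energy`), the function names, and the simple admissible lower bound are all assumptions made for illustration, not the paper's heuristic.

```python
def pair(pair_energy, i, r, j, s):
    """Pairwise energy between rotamer r at residue i and rotamer s at residue j."""
    if i < j:
        return pair_energy[(i, j)][r][s]
    return pair_energy[(j, i)][s][r]


def total_energy(self_energy, pair_energy, assignment):
    """Energy of a (possibly partial) assignment covering residues 0..len-1."""
    k = len(assignment)
    energy = sum(self_energy[i][assignment[i]] for i in range(k))
    for i in range(k):
        for j in range(i + 1, k):
            energy += pair(pair_energy, i, assignment[i], j, assignment[j])
    return energy


def lower_bound(self_energy, pair_energy, partial):
    """Admissible bound: exact energy of the assigned part plus, for each
    unassigned residue, its cheapest rotamer under optimistic pair terms."""
    n, k = len(self_energy), len(partial)
    bound = total_energy(self_energy, pair_energy, partial)
    for i in range(k, n):
        best = float("inf")
        for r in range(len(self_energy[i])):
            e = self_energy[i][r]
            e += sum(pair(pair_energy, i, r, j, partial[j]) for j in range(k))
            e += sum(
                min(pair(pair_energy, i, r, j, s) for s in range(len(self_energy[j])))
                for j in range(i + 1, n)
            )
            best = min(best, e)
        bound += best
    return bound


def branch_and_bound(self_energy, pair_energy):
    """Return (assignment, energy) of the minimum-energy rotamer assignment."""
    n = len(self_energy)
    best = {"energy": float("inf"), "assignment": None}

    def search(partial):
        if len(partial) == n:
            energy = total_energy(self_energy, pair_energy, partial)
            if energy < best["energy"]:
                best.update(energy=energy, assignment=list(partial))
            return
        i = len(partial)
        children = sorted(
            (lower_bound(self_energy, pair_energy, partial + [r]), r)
            for r in range(len(self_energy[i]))
        )
        for bound, r in children:
            if bound >= best["energy"]:
                break  # children are sorted, so the rest cannot improve either
            search(partial + [r])

    search([])
    return best["assignment"], best["energy"]


# Hypothetical toy instance: 3 residues with 2 rotamers each.
SELF_ENERGY = [[0, 1], [2, 0], [1, 2]]
PAIR_ENERGY = {
    (0, 1): [[3, 0], [0, 2]],
    (0, 2): [[0, 1], [1, 0]],
    (1, 2): [[0, 2], [1, 0]],
}
```

Calling `branch_and_bound(SELF_ENERGY, PAIR_ENERGY)` explores residues in index order and prunes any branch whose bound cannot beat the best complete assignment found so far, so the result is still exact.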


The Fast Convergence of Incremental PCA

arXiv.org Machine Learning

We consider a situation in which we see samples in $\mathbb{R}^d$ drawn i.i.d. from some distribution with mean zero and unknown covariance A. We wish to compute the top eigenvector of A in an incremental fashion - with an algorithm that maintains an estimate of the top eigenvector in O(d) space, and incrementally adjusts the estimate with each new data point that arrives. Two classical such schemes are due to Krasulina (1969) and Oja (1983). We give finite-sample convergence rates for both.
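As an illustration of the kind of O(d)-space update the abstract refers to, the following is a minimal sketch of Oja's rule in pure Python. The step-size schedule `c / n`, the random initialization, and the helper `gaussian_stream` are assumptions chosen for this example and are not taken from the abstract.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def oja_top_eigenvector(samples, dim, c=1.0, seed=0):
    """Estimate the top eigenvector of the covariance with Oja's rule.

    Update: v <- v + gamma_n * x * (x . v), followed by renormalization.
    Only the current estimate v (d numbers) is stored.
    """
    rng = random.Random(seed)
    v = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
    for n, x in enumerate(samples, start=1):
        gamma = c / n  # assumed step-size schedule
        proj = sum(a * b for a, b in zip(x, v))
        v = normalize([a + gamma * proj * b for a, b in zip(v, x)])
    return v


def gaussian_stream(scales, n, seed=0):
    """Hypothetical test data: zero-mean Gaussian with covariance diag(scales^2)."""
    rng = random.Random(seed)
    return [[s * rng.gauss(0.0, 1.0) for s in scales] for _ in range(n)]
```

For example, `oja_top_eigenvector(gaussian_stream([3.0, 1.0, 1.0], 5000), dim=3)` should return a unit vector close to the first coordinate axis (up to sign), since that axis carries the largest variance.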


Holographic Graph Neuron: a Bio-Inspired Architecture for Pattern Processing

arXiv.org Artificial Intelligence

This article proposes the use of Vector Symbolic Architectures for implementing Hierarchical Graph Neuron, an architecture for memorizing patterns of generic sensor stimuli. The adoption of a Vector Symbolic representation ensures a one-layered design for the approach, while maintaining the previously reported properties and performance characteristics of Hierarchical Graph Neuron, and also improving the noise resistance of the architecture. The proposed architecture enables a linear (with respect to the number of stored entries) time search for an arbitrary sub-pattern. Graph Neuron (GN) is an approach for memorizing patterns of generic sensor stimuli for later template matching. It is based on the hypothesis that a better associative memory resource can be created by changing the emphasis from high-speed sequential CPU processing to parallel network-centric processing [2], [3]. In contrast to contemporary machine learning approaches, GN allows the introduction of new patterns into the learning set without the need for retraining. Whilst doing so, it exhibits a high level of scalability, i.e., its performance and accuracy do not degrade as the number of stored patterns increases over time. Vector Symbolic Architectures (VSA) [4] are a bio-inspired method of representing concepts and their meaning for modeling cognitive reasoning. They exhibit a set of unique properties that make them suitable for the implementation of artificial general intelligence [5], [6], [7], and thus for the creation of complex systems for sensing and pattern recognition without reliance on complex computation. In the biological world, extremely successful applications of these approaches can be found.


Dirichlet Process Parsimonious Mixtures for clustering

arXiv.org Machine Learning

The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have proven successful, in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious mixtures (DPPM), which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models are Bayesian nonparametric parsimonious mixture models that allow simultaneous inference of the model parameters, the optimal number of mixture components, and the optimal parsimonious mixture structure from the data. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the developed DPPM models and provide a Bayesian model selection framework using Bayes factors. We apply them to cluster simulated data and real data sets, and compare them to the standard parsimonious mixture models. The obtained results highlight the effectiveness of the proposed nonparametric parsimonious mixture models as a good nonparametric alternative to the parametric parsimonious models.


On Generalizing the C-Bound to the Multiclass and Multi-label Settings

arXiv.org Machine Learning

The C-bound, introduced in Lacasse et al., gives a tight upper bound on the risk of a binary majority vote classifier. In this work, we present a first step towards extending this work to more complex outputs, by providing generalizations of the C-bound to the multiclass and multi-label settings.
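For readers unfamiliar with the binary case, the C-bound is usually stated in terms of the first two moments of the margin of the weighted vote: the risk of the majority vote is at most 1 - (E[M])^2 / E[M^2] whenever E[M] > 0. The sketch below evaluates this quantity on a sample of margins. The function names and the sample values are illustrative assumptions, and the sketch covers only the binary setting, not the multiclass or multi-label generalizations of the paper.

```python
def c_bound(margins):
    """Empirical binary C-bound: 1 - (first moment)^2 / (second moment).

    Each margin is y * (weighted average vote at x), a value in [-1, 1].
    The bound is only defined when the mean margin is strictly positive.
    """
    n = len(margins)
    first_moment = sum(margins) / n
    second_moment = sum(m * m for m in margins) / n
    if first_moment <= 0:
        raise ValueError("the C-bound requires a positive mean margin")
    return 1.0 - first_moment * first_moment / second_moment


def majority_vote_risk(margins):
    """Fraction of examples the majority vote gets wrong (margin <= 0)."""
    return sum(1 for m in margins if m <= 0) / len(margins)


# Hypothetical margins for four examples; the third is misclassified.
EXAMPLE_MARGINS = [0.5, 0.5, -0.5, 0.5]
```

On `EXAMPLE_MARGINS` the mean margin is 0.25 and the second moment is 0.25, so the bound evaluates to 0.75, which is above the observed error rate of 0.25 on the same sample.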


An Improvement to the Domain Adaptation Bound in a PAC-Bayesian context

arXiv.org Machine Learning

This paper provides a theoretical analysis of domain adaptation based on the PAC-Bayesian theory. We improve the previous domain adaptation bound obtained by Germain et al. in two ways. First, we give a generalization bound that is tighter and easier to interpret. Second, we provide a new analysis of the constant term appearing in the bound, which can be of high interest for developing new algorithmic solutions.


Random Bits Regression: a Strong General Predictor for Big Data

arXiv.org Machine Learning

To improve the accuracy and speed of regression and classification, we present a data-based prediction method, Random Bits Regression (RBR). This method first generates a large number of random binary intermediate/derived features based on the original input matrix, and then performs regularized linear/logistic regression on those intermediate/derived features to predict the outcome. Benchmark analyses on a simulated dataset, UCI machine learning repository datasets and a GWAS dataset showed that RBR outperforms other popular methods in accuracy and robustness. RBR (available at https://sourceforge.net/projects/rbr/) is very fast and requires a reasonable amount of memory; it therefore provides a strong, robust and fast predictor in the big data era.


Combined modeling of sparse and dense noise for improvement of Relevance Vector Machine

arXiv.org Machine Learning

Using a Bayesian approach, we consider the problem of recovering sparse signals under additive sparse and dense noise. Typically, sparse noise models outliers, impulse bursts or data loss. To handle sparse noise, existing methods simultaneously estimate the sparse signal of interest and the sparse noise of no interest. To estimate the sparse signal without needing to estimate the sparse noise, we construct a robust Relevance Vector Machine (RVM). In the RVM, sparse noise and ever-present dense noise are treated through a combined noise model. The precision of the combined noise is modeled by a diagonal matrix. We show that the new RVM update equations correspond to a non-symmetric sparsity-inducing cost function. Further, the combined modeling is found to be computationally more efficient. We also extend the method to block-sparse signals and noise with known and unknown block structures. Through simulations, we show the performance and computational efficiency of the new RVM in several applications: recovery of sparse and block-sparse signals, housing price prediction and image denoising.


A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning

arXiv.org Machine Learning

Learning sparse combinations is a frequent theme in machine learning. In this paper, we study its associated optimization problem in the distributed setting where the elements to be combined are not centrally located but spread over a network. We address the key challenges of balancing communication costs and optimization errors. To this end, we propose a distributed Frank-Wolfe (dFW) algorithm. We obtain theoretical guarantees on the optimization error $\epsilon$ and communication cost that do not depend on the total number of combining elements. We further show that the communication cost of dFW is optimal by deriving a lower bound on the communication cost required to construct an $\epsilon$-approximate solution. We validate our theoretical analysis with empirical studies on synthetic and real-world data, which demonstrate that dFW outperforms both baselines and competing methods. We also study the performance of dFW when the conditions of our analysis are relaxed, and show that dFW is fairly robust.
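As background for the abstract above, the following is a minimal sketch of the standard, centralized Frank-Wolfe iteration for learning a sparse convex combination of atoms. It is not the distributed dFW algorithm and involves no communication between nodes. The least-squares objective, the simplex constraint, the step size 2 / (k + 2), and all names are assumptions chosen for illustration.

```python
def combine(atoms, alpha):
    """Weighted sum of atoms: sum_j alpha[j] * atoms[j]."""
    dim = len(atoms[0])
    return [sum(alpha[j] * atoms[j][i] for j in range(len(atoms))) for i in range(dim)]


def objective(atoms, target, alpha):
    """Squared Euclidean distance between the combination and the target."""
    return sum((c - t) ** 2 for c, t in zip(combine(atoms, alpha), target))


def frank_wolfe_simplex(atoms, target, num_iters=200):
    """Minimize objective(...) over the probability simplex with Frank-Wolfe.

    Each iteration adds weight to a single atom, so after k iterations the
    solution has at most k + 1 nonzero coefficients.
    """
    n = len(atoms)
    alpha = [0.0] * n
    alpha[0] = 1.0  # start at a vertex of the simplex
    for k in range(num_iters):
        residual = [c - t for c, t in zip(combine(atoms, alpha), target)]
        # Gradient with respect to alpha[j] is 2 * <atoms[j], residual>.
        grad = [2.0 * sum(a * r for a, r in zip(atoms[j], residual)) for j in range(n)]
        # Linear minimization over the simplex selects the steepest coordinate.
        j_star = min(range(n), key=grad.__getitem__)
        gamma = 2.0 / (k + 2.0)  # standard open-loop step size
        alpha = [(1.0 - gamma) * a for a in alpha]
        alpha[j_star] += gamma
    return alpha


# Hypothetical toy problem: the target is the midpoint of the first two atoms.
ATOMS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
TARGET = [0.5, 0.5]
```

Running `frank_wolfe_simplex(ATOMS, TARGET)` returns nonnegative weights that sum to one and whose combination approaches `TARGET`. In a distributed setting the atoms would be spread across nodes, which is the situation the abstract addresses and which this sketch deliberately leaves out.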


SPRITE: A Response Model For Multiple Choice Testing

arXiv.org Machine Learning

Item response theory (IRT) models for categorical response data are widely used in the analysis of educational data, computerized adaptive testing, and psychological surveys. However, most IRT models rely on both the assumption that categories are strictly ordered and the assumption that this ordering is known a priori. These assumptions are impractical in many real-world scenarios, such as multiple-choice exams where the levels of incorrectness for the distractor categories are often unknown. While a number of results exist on IRT models for unordered categorical data, they tend to have restrictive modeling assumptions that lead to poor data fitting performance in practice. Furthermore, existing unordered categorical models have parameters that are difficult to interpret. In this work, we propose a novel methodology for unordered categorical IRT that we call SPRITE (short for stochastic polytomous response item model) that: (i) analyzes both ordered and unordered categories, (ii) offers interpretable outputs, and (iii) provides improved data fitting compared to existing models. We compare SPRITE to existing item response models and demonstrate its efficacy on both synthetic and real-world educational datasets.