Collaborating Authors

Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification Artificial Intelligence

Multiple kernel learning (MKL) method is generally believed to perform better than single kernel method. However, some empirical studies show that this is not always true: the combination of multiple kernels may even yield an even worse performance than using a single kernel. There are two possible reasons for the failure: (i) most existing MKL methods assume that the optimal kernel is a linear combination of base kernels, which may not hold true; and (ii) some kernel weights are inappropriately assigned due to noises and carelessly designed algorithms. In this paper, we propose a novel MKL framework by following two intuitive assumptions: (i) each kernel is a perturbation of the consensus kernel; and (ii) the kernel that is close to the consensus kernel should be assigned a large weight. Impressively, the proposed method can automatically assign an appropriate weight to each kernel without introducing additional parameters, as existing methods do. The proposed framework is integrated into a unified framework for graph-based clustering and semi-supervised classification. We have conducted experiments on multiple benchmark datasets and our empirical results verify the superiority of the proposed framework.

Sequential Randomized Matrix Factorization for Gaussian Processes: Efficient Predictions and Hyper-parameter Optimization Machine Learning

This paper presents a sequential randomized lowrank matrix factorization approach for incrementally predicting values of an unknown function at test points using the Gaussian Processes framework. It is well-known that in the Gaussian processes framework, the computational bottlenecks are the inversion of the (regularized) kernel matrix and the computation of the hyper-parameters defining the kernel. The main contributions of this paper are two-fold. First, we formalize an approach to compute the inverse of the kernel matrix using randomized matrix factorization algorithms in a streaming scenario, i.e., data is generated incrementally over time. The metrics of accuracy and computational efficiency of the proposed method are compared against a batch approach based on use of randomized matrix factorization and an existing streaming approach based on approximating the Gaussian process by a finite set of basis vectors. Second, we extend the sequential factorization approach to a class of kernel functions for which the hyperparameters can be efficiently optimized. All results are demonstrated on two publicly available datasets.

Multi-label Multiple Kernel Learning

Neural Information Processing Systems

We present a multi-label multiple kernel learning (MKL) formulation, in which the data are embedded into a low-dimensional space directed by the instance-label correlations encoded into a hypergraph. We formulate the problem in the kernel-induced feature space and propose to learn the kernel matrix as a linear combination of a given collection of kernel matrices in the MKL framework. The proposed learning formulation leads to a non-smooth min-max problem, and it can be cast into a semi-infinite linear program (SILP). We further propose an approximate formulation with a guaranteed error bound which involves an unconstrained and convex optimization problem. In addition, we show that the objective function of the approximate formulation is continuously differentiable with Lipschitz gradient, and hence existing methods can be employed to compute the optimal solution efficiently.

Optimizing Kernel Machines using Deep Learning Machine Learning

Building highly non-linear and non-parametric models is central to several state-of-the-art machine learning systems. Kernel methods form an important class of techniques that induce a reproducing kernel Hilbert space (RKHS) for inferring non-linear models through the construction of similarity functions from data. These methods are particularly preferred in cases where the training data sizes are limited and when prior knowledge of the data similarities is available. Despite their usefulness, they are limited by the computational complexity and their inability to support end-to-end learning with a task-specific objective. On the other hand, deep neural networks have become the de facto solution for end-to-end inference in several learning paradigms. In this article, we explore the idea of using deep architectures to perform kernel machine optimization, for both computational efficiency and end-to-end inferencing. To this end, we develop the DKMO (Deep Kernel Machine Optimization) framework, that creates an ensemble of dense embeddings using Nystrom kernel approximations and utilizes deep learning to generate task-specific representations through the fusion of the embeddings. Intuitively, the filters of the network are trained to fuse information from an ensemble of linear subspaces in the RKHS. Furthermore, we introduce the kernel dropout regularization to enable improved training convergence. Finally, we extend this framework to the multiple kernel case, by coupling a global fusion layer with pre-trained deep kernel machines for each of the constituent kernels. Using case studies with limited training data, and lack of explicit feature sources, we demonstrate the effectiveness of our framework over conventional model inferencing techniques.

Gene-Gene association for Imaging Genetics Data using Robust Kernel Canonical Correlation Analysis Machine Learning

In genome-wide interaction studies, to detect gene-gene interactions, most methods are divided into two folds: single nucleotide polymorphisms (SNP) based and gene-based methods. Basically, the methods based on the gene are more effective than the methods based on a single SNP. Recent years, while the kernel canonical correlation analysis (Classical kernel CCA) based U statistic (KCCU) has proposed to detect the nonlinear relationship between genes. To estimate the variance in KCCU, they have used resampling based methods which are highly computationally intensive. In addition, classical kernel CCA is not robust to contaminated data. We, therefore, first discuss robust kernel mean element, the robust kernel covariance, and cross-covariance operators. Second, we propose a method based on influence function to estimate the variance of the KCCU. Third, we propose a nonparametric robust KCCU method based on robust kernel CCA, which is designed for contaminated data and less sensitive to noise than classical kernel CCA. Finally, we investigate the proposed methods to synthesized data and imaging genetic data set. Based on gene ontology and pathway analysis, the synthesized and genetics analysis demonstrate that the proposed robust method shows the superior performance of the state-of-the-art methods.