A Spectral Series Approach to High-Dimensional Nonparametric Regression
A key question in modern statistics is how to make fast and reliable inferences for complex, high-dimensional data. While there has been much interest in sparse techniques, current methods do not generalize well to data with nonlinear structure. In this work, we present an orthogonal series estimator for predictors that are complex aggregate objects, such as natural images, galaxy spectra, trajectories, and movies. Our series approach ties together ideas from kernel machine learning and Fourier methods. We expand the unknown regression function in terms of the eigenfunctions of a kernel-based operator, and we take advantage of the orthogonality of the basis with respect to the underlying data distribution, P, to speed up computation and parameter tuning. If the kernel is appropriately chosen, the eigenfunctions adapt to the intrinsic geometry and dimension of the data. We provide theoretical guarantees for a radial kernel with varying bandwidth, and we relate smoothness of the regression function with respect to P to sparsity in the eigenbasis. Finally, using simulated and real-world data, we systematically compare the performance of the spectral series approach with classical kernel smoothing, k-nearest-neighbors regression, kernel ridge regression, and state-of-the-art manifold and local regression methods.
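To make the construction concrete, here is a minimal sketch of such an estimator, assuming a Gaussian kernel and a Nystrom-style out-of-sample extension; the bandwidth `eps` and number of basis functions `J` are illustrative tuning parameters, not values from the paper.

```python
import numpy as np

def spectral_series_fit(X, y, eps=1.0, J=10):
    # Gram matrix of the radial kernel on the training sample.
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-D2 / (4 * eps))
    n = len(X)
    # Eigenvectors of K/n estimate the operator's eigenfunctions,
    # orthonormal with respect to the empirical data distribution.
    lam, V = np.linalg.eigh(K / n)
    lam, V = lam[::-1][:J], V[:, ::-1][:, :J]    # keep the J leading pairs
    beta = np.sqrt(n) * V.T @ y / n              # expansion coefficients
    return (X, eps, n, lam, V, beta)

def spectral_series_predict(model, Xnew):
    X, eps, n, lam, V, beta = model
    D2 = ((Xnew[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    Kx = np.exp(-D2 / (4 * eps))
    # Nystrom extension of each estimated eigenfunction to new points.
    Psi = np.sqrt(n) * (Kx @ V) / (n * lam)
    return Psi @ beta
```

Because the basis is orthonormal with respect to the empirical distribution, the coefficients are plain inner products, which is what makes tuning cheap in this family of estimators.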
Greedy Deep Dictionary Learning
Tariyal, Snigdha, Majumdar, Angshul, Singh, Richa, Vatsa, Mayank
In this work we propose a new deep learning tool called deep dictionary learning. Multi-level dictionaries are learnt in a greedy fashion, one layer at a time, so that each level requires solving only a simple (shallow) dictionary learning problem whose solution is well known. We apply the proposed technique to some benchmark deep learning datasets. We compare our results with other deep learning tools, such as the stacked autoencoder and the deep belief network, and with state-of-the-art supervised dictionary learning tools, such as discriminative KSVD and label-consistent KSVD. Our method yields better results than all of these.
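A minimal sketch of the greedy layer-wise idea, assuming each level is solved with scikit-learn's standard sparse dictionary learner; the layer sizes and sparsity penalty are illustrative, and the paper's exact per-layer formulation may differ.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

def greedy_deep_dictionary(X, layer_sizes=(128, 64, 32), alpha=1.0):
    """Learn multi-level dictionaries one layer at a time: the codes
    produced by one layer become the input of the next, so only a
    shallow dictionary learning problem is solved at each level."""
    codes, dictionaries = X, []
    for n_atoms in layer_sizes:
        dl = DictionaryLearning(n_components=n_atoms, alpha=alpha,
                                max_iter=200,
                                transform_algorithm='lasso_lars')
        codes = dl.fit_transform(codes)       # codes for this layer
        dictionaries.append(dl.components_)   # learned atoms
    return dictionaries, codes                # deepest codes = features
```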
Dimensionality Reduction via Regression in Hyperspectral Imagery
Laparra, Valero, Malo, Jesus, Camps-Valls, Gustau
This paper introduces a new unsupervised method for dimensionality reduction via regression (DRR). The algorithm belongs to the family of invertible transforms that generalize Principal Component Analysis (PCA) by using curvilinear instead of linear features. DRR identifies the nonlinear features through multivariate regression to ensure the reduction in redundancy between the PCA coefficients, the reduction of the variance of the scores, and the reduction in the reconstruction error. More importantly, unlike other nonlinear dimensionality reduction methods, its invertibility, volume preservation, and straightforward out-of-sample extension make DRR interpretable and easy to apply. The properties of DRR enable learning a broader class of data manifolds than the recently proposed Non-linear Principal Components Analysis (NLPCA) and Principal Polynomial Analysis (PPA). We illustrate the performance of the representation in reducing the dimensionality of remote sensing data. In particular, we tackle two common problems: processing very high dimensional spectral information, such as in hyperspectral image sounding data, and dealing with spatial-spectral image patches of multispectral images. Both settings pose collinearity and ill-determination problems. The expressive power of the features is assessed in terms of truncation error, estimation of atmospheric variables, and surface land cover classification error. Results show that DRR outperforms linear PCA and recently proposed invertible extensions based on neural networks (NLPCA) and univariate regressions (PPA).
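A minimal sketch of the DRR idea: after PCA, each score is predicted from the higher-variance scores and replaced by its residual, removing nonlinear redundancy. Kernel ridge regression stands in here for the multivariate regressor, and its hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.kernel_ridge import KernelRidge

def drr_fit_transform(X, alpha=1e-2, gamma=None):
    pca = PCA().fit(X)
    S = pca.transform(X)                # PCA scores, ordered by variance
    Z = S.copy()
    models = []
    for i in range(1, S.shape[1]):
        reg = KernelRidge(kernel='rbf', alpha=alpha, gamma=gamma)
        reg.fit(S[:, :i], S[:, i])      # predict score i from scores 0..i-1
        Z[:, i] = S[:, i] - reg.predict(S[:, :i])   # keep the residual only
        models.append(reg)
    return Z, (pca, models)
```

The transform is invertible by construction: adding each regression prediction back to its residual, in order, recovers the PCA scores, and `pca.inverse_transform` recovers the input.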
Principal Polynomial Analysis
Laparra, Valero, Jiménez, Sandra, Tuia, Devis, Camps-Valls, Gustau, Malo, Jesús
This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves, instead of straight lines. Contrary to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA shows a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it makes it possible to understand the identified features in the input domain, where the data has physical meaning, and to evaluate the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows an easy computation of information-theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet-Serret frames (local features) and of generalized curvatures at any point of the space. And fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, on both synthetic and real datasets from the UCI repository.
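One step of the idea in a minimal sketch: split the data into its leading principal direction and the orthogonal complement, predict the complement from the leading score with a univariate polynomial, and pass the residual to the next step. The polynomial degree is an illustrative assumption.

```python
import numpy as np

def ppa_step(X, degree=3):
    X = X - X.mean(0)
    # Leading principal direction via SVD.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    v = Vt[0]
    t = X @ v                          # univariate score along the curve
    R = X - np.outer(t, v)             # orthogonal complement
    # Univariate polynomial regression of the complement on the score.
    B = np.vander(t, degree + 1)       # polynomial design matrix
    C, *_ = np.linalg.lstsq(B, R, rcond=None)
    residual = R - B @ C               # what the polynomial cannot explain
    return t, (v, C), residual

# Chaining ppa_step on `residual` yields the sequence of principal
# polynomials; each step only ever solves univariate regressions.
```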
Mapping Tractography Across Subjects
Nguyen, Thien Bao, Olivetti, Emanuele, Avesani, Paolo
Diffusion magnetic resonance imaging (dMRI) and tractography provide means to study the anatomical structures within the white matter of the brain. When studying tractography data across subjects, it is usually necessary to align, i.e. to register, the tractographies. This registration step is most often performed by applying the transformation resulting from the registration of other volumetric images (T1, FA). In contrast with registration methods that "transform" tractographies, in this work we try to find which streamline in one tractography corresponds to which streamline in the other tractography, without any transformation. In other words, we try to find a "mapping" between the tractographies. We propose a graph-based solution for the tractography mapping problem and we explain the similarities and differences with the related, well-known graph matching problem. Specifically, we define a loss function based on the pairwise streamline distance and reformulate the mapping problem as combinatorial optimization of that loss function. We show preliminary promising results in which we compare the proposed method, implemented with simulated annealing, against a standard registration technique on the task of segmenting the corticospinal tract.
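A minimal sketch of the combinatorial formulation, assuming precomputed pairwise streamline distance matrices `DA` and `DB` for the two tractographies; the quadratic loss and the linear cooling schedule are illustrative stand-ins, not the paper's exact choices.

```python
import numpy as np

def mapping_loss(m, DA, DB):
    # Discrepancy between pairwise distances in A and their images in B.
    return ((DA - DB[np.ix_(m, m)]) ** 2).sum()

def anneal_mapping(DA, DB, n_iter=20000, T0=1.0,
                   rng=np.random.default_rng(0)):
    nA, nB = len(DA), len(DB)
    m = rng.integers(0, nB, size=nA)      # initial (many-to-one) mapping
    loss = mapping_loss(m, DA, DB)
    for it in range(n_iter):
        T = T0 * (1 - it / n_iter) + 1e-6 # cooling schedule
        i, new = rng.integers(nA), rng.integers(nB)
        old = m[i]
        m[i] = new                        # propose reassigning one streamline
        new_loss = mapping_loss(m, DA, DB)
        if new_loss < loss or rng.random() < np.exp((loss - new_loss) / T):
            loss = new_loss               # accept the move
        else:
            m[i] = old                    # reject and restore
    return m, loss
```

A practical implementation would update the loss incrementally after each single-streamline move instead of recomputing the full double sum.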
Kernels for sequentially ordered data
Király, Franz J, Oberhauser, Harald
We present a novel framework for kernel learning with sequential data of any kind, such as time series, sequences of graphs, or strings. Our approach is based on signature features, which can be seen as an ordered variant of sample (cross-)moments; it allows one to obtain a "sequentialized" version of any static kernel. The sequential kernels are efficiently computable for discrete sequences and are shown to approximate a continuous moment form in a sampling sense. A number of known kernels for sequences arise as "sequentializations" of suitable static kernels: string kernels may be obtained as a special case, and alignment kernels are closely related up to a modification that resolves their open non-definiteness issue. Our experiments indicate that our signature-based sequential kernel framework may be a promising approach to learning with sequential data, such as time series, while avoiding extensive manual pre-processing.
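To illustrate the "ordered moments" intuition, here is a minimal sketch of low-order signature features for a multivariate series: level 1 sums the increments, level 2 takes ordered double sums of increment outer products. Taking a plain inner product of these truncated features, as done below, is an illustrative simplification of the sequentialized kernel.

```python
import numpy as np

def signature_features(x):
    """x: array of shape (length, dim). Truncated signature, levels 1-2."""
    dx = np.diff(x, axis=0)                  # increments of the path
    level1 = dx.sum(0)                       # ordered "first moments"
    # Ordered second moments: sum over s < t of dx_s (outer) dx_t.
    csum = np.cumsum(dx, axis=0)
    level2 = np.einsum('ti,tj->ij', csum[:-1], dx[1:])
    return np.concatenate([level1, level2.ravel()])

def sequential_kernel(x, y):
    return signature_features(x) @ signature_features(y)
```

Unlike plain moments, the level-2 term is order-sensitive: reversing a series changes its signature, which is what lets these features distinguish sequential structure.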
Early Predictions of Movie Success: the Who, What, and When of Profitability
This paper proposes a decision support system to aid movie investment decisions at the early stage of movie production. The system predicts the success of a movie, measured by its profitability, by leveraging historical data from various sources. Using social network analysis and text mining techniques, the system automatically extracts several groups of features, including "who" is on the cast, "what" a movie is about, "when" a movie will be released, as well as "hybrid" features that match "who" with "what", and "when" with "what". Experimental results on movies released over an 11-year period show that the system outperforms benchmark methods by a large margin in predicting movie profitability. The novel features we propose also contribute substantially to prediction performance. In addition to designing a decision support system with practical utility, our analysis of key factors for movie profitability may also have implications for theoretical research on team performance and the success of creative work.
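A purely illustrative sketch of the "hybrid" feature idea: crossing cast features ("who") with content features ("what") as interaction terms. The feature encodings and this particular construction are assumptions, not the paper's actual pipeline.

```python
import numpy as np

def hybrid_features(who, what):
    """who: (n, p) cast features; what: (n, q) content/topic features.
    Returns (n, p*q) interaction features matching 'who' with 'what',
    e.g. a director indicator crossed with a genre topic weight."""
    return np.einsum('np,nq->npq', who, what).reshape(len(who), -1)
```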
High Dimensional Data Modeling Techniques for Detection of Chemical Plumes and Anomalies in Hyperspectral Images and Movies
Wang, Yi, Chen, Guangliang, Maggioni, Mauro
We briefly review recent progress in techniques for modeling and analyzing hyperspectral images and movies, in particular for detecting plumes of both known and unknown chemicals. For detecting chemicals of known spectrum, we extend the technique of using a single subspace for modeling the background to a "mixture of subspaces" model to tackle more complicated backgrounds. Furthermore, we use partial least squares regression on a resampled training set to boost performance. For the detection of unknown chemicals, we view the problem as an anomaly detection problem, and use novel estimators with low sample complexity for intrinsically low-dimensional data in high dimensions that enable us to model the "normal" spectra and detect anomalies. We apply these algorithms to benchmark data sets made available by the Automated Target Detection program co-funded by NSF, DTRA and NGA, and compare, when applicable, to current state-of-the-art algorithms, with favorable results.
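A minimal sketch of a "mixture of subspaces" background model in the anomaly-detection spirit: cluster the spectra, fit a local PCA subspace per cluster, and score each pixel by its residual to its cluster's subspace. The number of clusters and the subspace dimension are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def subspace_residuals(X, n_clusters=5, dim=3):
    """X: (n_pixels, n_bands) spectra. Returns per-pixel residual scores;
    large residuals flag spectra the background model cannot explain."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    scores = np.empty(len(X))
    for k in range(n_clusters):
        idx = labels == k
        pca = PCA(n_components=dim).fit(X[idx])
        recon = pca.inverse_transform(pca.transform(X[idx]))
        scores[idx] = np.linalg.norm(X[idx] - recon, axis=1)
    return scores
```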
Robustness Analysis of Preconditioned Successive Projection Algorithm for General Form of Separable NMF Problem
The successive projection algorithm (SPA) has been known to work well for separable nonnegative matrix factorization (NMF) problems arising in applications such as topic extraction from documents and endmember detection in hyperspectral images. One of the reasons is that the algorithm is robust to noise. Gillis and Vavasis showed in [SIAM J. Optim., 25(1), pp. 677-698, 2015] that a preconditioner can further enhance its noise robustness. The proof rested on the condition that the dimension $d$ and factorization rank $r$ in the separable NMF problem coincide. However, it may be unrealistic to expect that this condition holds in separable NMF problems appearing in actual applications; in such problems, $d$ is usually greater than $r$. This paper shows, without the condition $d=r$, that the preconditioned SPA is robust to noise.
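For reference, the core SPA loop is short: under separability, $M \approx M(:, K)H$ for some index set $K$ of $r$ "extreme" columns, and SPA finds $K$ by repeatedly picking the column of largest norm and projecting it out. The sketch below shows plain SPA; the preconditioning studied in the paper would premultiply $M$ before this loop runs.

```python
import numpy as np

def spa(M, r):
    """Return indices K of r extreme columns of the d x n matrix M."""
    R, K = M.astype(float).copy(), []
    for _ in range(r):
        j = int(np.argmax((R ** 2).sum(0)))  # column with largest l2 norm
        K.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)           # project out that direction
    return K
```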
Matrix Completion from Fewer Entries: Spectral Detectability and Rank Estimation
Saade, Alaa, Krzakala, Florent, Zdeborová, Lenka
The completion of low-rank matrices from few entries is a task with many practical applications. We consider here two aspects of this problem: detectability, i.e. the ability to estimate the rank $r$ reliably from the fewest possible random entries, and performance in achieving small reconstruction error. We propose a spectral algorithm for these two tasks called MaCBetH (for Matrix Completion with the Bethe Hessian). The rank is estimated as the number of negative eigenvalues of the Bethe Hessian matrix, and the corresponding eigenvectors are used as the initial condition for the minimization of the discrepancy between the estimated matrix and the revealed entries. We analyze the performance in a random matrix setting using results from the statistical mechanics of the Hopfield neural network, and show in particular that MaCBetH efficiently detects the rank $r$ of a large $n\times m$ matrix from $C(r)r\sqrt{nm}$ entries, where $C(r)$ is a constant close to $1$. We also evaluate the corresponding root-mean-square error empirically and show that MaCBetH compares favorably to other existing approaches.
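A schematic sketch of the rank-estimation step, assuming the standard weighted Bethe Hessian form with edge weights $w_{ij}=\tanh(\beta M_{ij})$ on the bipartite graph of revealed entries; the paper also prescribes how to set $\beta$, which is left as an explicit input here.

```python
import numpy as np

def bethe_hessian_rank(entries, n, m, beta):
    """entries: list of (i, j, value) revealed entries of an n x m matrix,
    with 0-based indices. Estimated rank = number of negative eigenvalues
    of the Bethe Hessian built on the bipartite row/column graph."""
    N = n + m                          # one node per row and per column
    H = np.eye(N)
    for i, j, v in entries:
        w = np.tanh(beta * v)          # assumes beta * v stays bounded
        a, b = i, n + j                # bipartite indexing
        H[a, b] -= w / (1 - w ** 2)
        H[b, a] -= w / (1 - w ** 2)
        H[a, a] += w ** 2 / (1 - w ** 2)
        H[b, b] += w ** 2 / (1 - w ** 2)
    eigvals, eigvecs = np.linalg.eigh(H)   # ascending eigenvalues
    r_hat = int((eigvals < 0).sum())       # estimated rank
    return r_hat, eigvecs[:, :r_hat]       # eigenvectors seed the local fit
```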