 Statistical Learning


Multiscale Geometric Methods for Data Sets II: Geometric Multi-Resolution Analysis

arXiv.org Machine Learning

Data sets are often modeled as point clouds in $R^D$, for $D$ large. It is often assumed that the data has some interesting low-dimensional structure, for example that of a $d$-dimensional manifold $M$, with $d$ much smaller than $D$. When $M$ is simply a linear subspace, one may exploit this assumption to encode the data efficiently by projecting onto a dictionary of $d$ vectors in $R^D$ (for example, found by SVD), at a cost of $(n+D)d$ for $n$ data points. When $M$ is nonlinear, there are no "explicit" constructions of dictionaries that achieve a similar efficiency: typically one uses either random dictionaries, or dictionaries obtained by black-box optimization. In this paper we construct data-dependent multi-scale dictionaries that aim at efficiently encoding and manipulating the data. Their construction is fast, and so are the algorithms that map data points to dictionary coefficients and vice versa. In addition, data points are guaranteed to have a sparse representation in terms of the dictionary. We think of these dictionaries as the analogue of wavelets, but for approximating point clouds rather than functions.
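The linear-subspace case mentioned in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the synthetic data, the sizes `n`, `D`, `d`, and the helper name `svd_encode` are made up for the example and do not come from the paper, which is concerned with the nonlinear case.

```python
import numpy as np

def svd_encode(X, d):
    """Return (coeffs, dictionary) such that X is approximated by coeffs @ dictionary."""
    # Rows of Vt are orthonormal directions in R^D, ordered by singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    dictionary = Vt[:d]             # d dictionary vectors, shape (d, D)
    coeffs = X @ dictionary.T       # one d-vector of coefficients per point, shape (n, d)
    return coeffs, dictionary

# Hypothetical data: n points lying exactly on a d-dimensional linear subspace of R^D.
rng = np.random.default_rng(0)
n, D, d = 500, 50, 3
X = rng.standard_normal((n, d)) @ rng.standard_normal((d, D))

coeffs, dictionary = svd_encode(X, d)
X_hat = coeffs @ dictionary                  # decode back to R^D
stored = coeffs.size + dictionary.size       # n*d coefficients plus D*d dictionary entries
```

Here `stored` equals `(n + D) * d`, which matches the encoding cost quoted in the abstract, compared with `n * D` numbers for the raw point cloud.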


Sparse Volterra and Polynomial Regression Models: Recoverability and Estimation

arXiv.org Machine Learning

Volterra and polynomial regression models play a major role in nonlinear system identification and inference tasks. Exciting applications ranging from neuroscience to genome-wide association analysis build on these models with the additional requirement of parsimony. This requirement has high interpretative value, but unfortunately cannot be met by least-squares based or kernel regression methods. To this end, compressed sampling (CS) approaches, already successful in linear regression settings, can offer a viable alternative. The viability of CS for sparse Volterra and polynomial models is the core theme of this work. A common sparse regression task is initially posed for the two models. Building on (weighted) Lasso-based schemes, an adaptive RLS-type algorithm is developed for sparse polynomial regressions. The identifiability of polynomial models is critically challenged by dimensionality. However, following the CS principle, when these models are sparse, they could be recovered by far fewer measurements. To quantify the sufficient number of measurements for a given level of sparsity, restricted isometry properties (RIP) are investigated in commonly met polynomial regression settings, generalizing known results for their linear counterparts. The merits of the novel (weighted) adaptive CS algorithms to sparse polynomial modeling are verified through synthetic as well as real data tests for genotype-phenotype analysis.
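To make the dimensionality issue concrete, the following sketch counts the coefficients of a full polynomial expansion. It assumes a generic polynomial model with a constant term and all monomials up to a given degree; the function names are hypothetical and the counting convention may differ from the one used in the paper.

```python
from itertools import combinations_with_replacement
from math import comb

def num_polynomial_terms(num_inputs, degree):
    """Number of monomials of total degree <= degree in num_inputs variables."""
    return comb(num_inputs + degree, degree)

def enumerate_monomials(num_inputs, degree):
    """List each monomial as a tuple of variable indices (the empty tuple is the constant)."""
    terms = []
    for p in range(degree + 1):
        # A monomial of degree p is a multiset of p variable indices.
        terms.extend(combinations_with_replacement(range(num_inputs), p))
    return terms

for L in (10, 100):
    print(L, num_polynomial_terms(L, 3))
```

With 100 inputs and degree 3 the model already has 176851 coefficients, which is why a plain least-squares fit needs so many measurements and why sparsity is attractive.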


A Combinatorial Optimisation Approach to Designing Dual-Parented Long-Reach Passive Optical Networks

arXiv.org Artificial Intelligence

We present an application focused on the design of resilient long-reach passive optical networks. We specifically consider dual-parented networks whereby each customer must be connected to two metro sites via local exchange sites. An important property of such a placement is resilience to single metro node failure. The objective of the application is to determine the optimal position of a set of metro nodes such that the total optical fibre length is minimized. We prove that this problem is NP-Complete. We present two alternative combinatorial optimisation approaches to finding an optimal metro node placement: a mixed integer linear programming (MIP) formulation of the problem, and a hybrid approach that uses clustering as a preprocessing step. We consider a detailed case study based on a network for Ireland. The hybrid approach scales well and finds solutions that are close to optimal, with a runtime that is two orders of magnitude better than the MIP model.


Color Texture Classification Approach Based on Combination of Primitive Pattern Units and Statistical Features

arXiv.org Artificial Intelligence

Texture classification has received much attention from image processing researchers since the late 1980s, and many different methods have been proposed to solve it. Most of these methods attempt to describe and discriminate textures based on linear and non-linear patterns. The linear and non-linear patterns in any window are formed by grain components arranged in a particular order. A grain component is a primitive unit of morphology, and the most meaningful information often appears in the form of its occurrence. The approach proposed in this paper analyzes a texture based on its grain components, builds a histogram of those components, and extracts statistical features from the histogram to classify the texture. Finally, to increase classification accuracy, the proposed approach is extended to color images so that each RGB channel is analyzed individually. Although the approach is general and could be used in different applications, it has been tested on stone textures, and the results support the quality of the approach.


Robust Kernel Density Estimation

arXiv.org Machine Learning

We propose a method for nonparametric density estimation that exhibits robustness to contamination of the training sample. This method achieves robustness by combining a traditional kernel density estimator (KDE) with ideas from classical $M$-estimation. We interpret the KDE based on a radial, positive semi-definite kernel as a sample mean in the associated reproducing kernel Hilbert space. Since the sample mean is sensitive to outliers, we estimate it robustly via $M$-estimation, yielding a robust kernel density estimator (RKDE). An RKDE can be computed efficiently via a kernelized iteratively re-weighted least squares (IRWLS) algorithm. Necessary and sufficient conditions are given for kernelized IRWLS to converge to the global minimizer of the $M$-estimator objective function. The robustness of the RKDE is demonstrated with a representer theorem, the influence function, and experimental results for density estimation and anomaly detection.
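The reweighting idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian kernel, a Huber-type loss, and a threshold `a` set to the median initial RKHS distance; all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Normalized Gaussian kernel matrix between the rows of A and the rows of B."""
    dim = A.shape[1]
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2 * sigma**2)) / (2 * np.pi * sigma**2) ** (dim / 2)

def rkhs_distances(K, w):
    """Distance in the RKHS from each mapped sample to the weighted mean sum_j w_j k(., x_j)."""
    d2 = np.diag(K) - 2 * K @ w + w @ K @ w
    return np.sqrt(np.maximum(d2, 0.0))

def rkde_weights(X, sigma, n_iter=200, tol=1e-10):
    """Kernelized IRWLS with a Huber-type loss; returns weights that sum to one."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    w = np.full(n, 1.0 / n)                  # uniform weights give the ordinary KDE
    a = np.median(rkhs_distances(K, w))      # assumed choice of the Huber threshold
    for _ in range(n_iter):
        dist = rkhs_distances(K, w)
        # Huber: psi(r)/r is 1 for r <= a and a/r beyond, so distant points are down-weighted.
        phi = np.where(dist <= a, 1.0, a / np.maximum(dist, 1e-12))
        w_new = phi / phi.sum()
        converged = np.abs(w_new - w).max() < tol
        w = w_new
        if converged:
            break
    return w

def rkde_density(X_train, w, X_new, sigma):
    """Evaluate the weighted kernel density estimate at the rows of X_new."""
    return gaussian_kernel(X_new, X_train, sigma) @ w
```

Passing uniform weights `1/n` to `rkde_density` recovers the standard KDE, so the two estimates can be compared on the same sample.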


ShareBoost: Efficient Multiclass Learning with Feature Sharing

arXiv.org Artificial Intelligence

Multiclass prediction is the problem of classifying an object into a relevant target class. We consider the problem of learning a multiclass predictor that uses only a few features; in particular, the number of features used should increase sub-linearly with the number of possible classes. This implies that features should be shared by several classes. We describe and analyze the ShareBoost algorithm for learning a multiclass predictor that uses few shared features. We prove that ShareBoost efficiently finds a predictor that uses few shared features (if such a predictor exists) and that it has a small generalization error. We also describe how to use ShareBoost for learning a non-linear predictor that has a fast evaluation time. In a series of experiments with natural data sets we demonstrate the benefits of ShareBoost and evaluate its success relative to other state-of-the-art approaches.


Variable Selection in High Dimensions with Random Designs and Orthogonal Matching Pursuit

arXiv.org Machine Learning

The performance of Orthogonal Matching Pursuit (OMP) for variable selection is analyzed for random designs. In contrast to the deterministic case, the performance here is measured after averaging over the distribution of the design matrix, so one can have far less stringent sparsity constraints on the coefficient vector. We demonstrate that for exactly sparse vectors, the performance of OMP is similar to known results on the Lasso algorithm [IEEE Trans. Inform. Theory 55 (2009) 2183-2202]. Moreover, variable selection under a more relaxed sparsity assumption on the coefficient vector, whereby one has control only on the $\ell_1$ norm of the smaller coefficients, is also analyzed. As a consequence of these results, we also show that the coefficient estimate satisfies strong oracle-type inequalities.
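For readers unfamiliar with OMP, here is a minimal sketch of the standard greedy procedure on a random Gaussian design. The stopping rule (a fixed number of steps `k`), the problem sizes, and the coefficient values are assumptions chosen for illustration, not settings from the paper.

```python
import numpy as np

def omp(X, y, k):
    """Greedy OMP: select k columns of X, refitting by least squares after each pick."""
    p = X.shape[1]
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(k):
        corr = np.abs(X.T @ residual)
        if support:
            corr[support] = 0.0              # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Orthogonal step: least-squares fit of y on all selected columns.
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    beta = np.zeros(p)
    beta[support] = coef
    return beta, sorted(support)

# Hypothetical noiseless example with a random design and a 3-sparse coefficient vector.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta_true = np.zeros(p)
beta_true[[5, 50, 120]] = [4.0, -3.0, 2.0]
y = X @ beta_true

beta_hat, support = omp(X, y, k=3)
```

Refitting all selected coefficients at every step is what distinguishes OMP from plain matching pursuit, which only updates the coefficient of the newly chosen column.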


Gradient-based kernel dimension reduction for supervised learning

arXiv.org Machine Learning

This paper proposes a novel kernel approach to linear dimension reduction for supervised learning. The purpose of the dimension reduction is to find directions in the input space that explain the output as effectively as possible. The proposed method uses an estimator for the gradient of the regression function, based on the covariance operators on reproducing kernel Hilbert spaces. In comparison with other existing methods, the proposed one has wide applicability without strong assumptions on the distributions or the type of variables, and uses a computationally simple eigendecomposition. Experimental results show that the proposed method successfully finds the effective directions with efficient computation.


Bayesian nonparametric multivariate convex regression

arXiv.org Machine Learning

X, where f(x) is the gradient of f at x. This is called the convex regression problem. Convex regression can easily be modified to allow concave regression by multiplying all of the values by negative one. Convex regression problems are common in economics, operations research and reinforcement learning. In economics, production functions (Skiba 1978) and consumer preferences (Meyer & Pratt 1968) are often convex, while in operations research and reinforcement learning, value functions for stochastic optimization problems can be convex (Shapiro et al. 2009). If a problem is known to be convex, a convex regression estimate provides advantages over an unrestricted estimate. First, convexity is a powerful regularizer: it places strong conditions on the derivatives, and hence the smoothness, of a function. Convexity constraints can substantially reduce overfitting and lead to more accurate predictions. Second, maintaining convexity allows the use of convex optimization solvers when the regression estimate is used in an objective function of an optimization problem. Multivariate convex regression has received relatively little attention in the literature.


Nonlinear Channel Estimation for OFDM System by Complex LS-SVM under High Mobility Conditions

arXiv.org Machine Learning

A nonlinear channel estimator using complex Least Squares Support Vector Machines (LS-SVM) is proposed for pilot-aided OFDM systems and applied to the Long Term Evolution (LTE) downlink under high mobility conditions. The estimation algorithm uses the reference signals to estimate the total frequency response of the highly selective multipath channel in the presence of non-Gaussian impulse noise interfering with the pilot signals. To do so, the algorithm maps the training data into a high-dimensional feature space and uses the structural risk minimization (SRM) principle to carry out the regression estimation for the frequency response function of the highly selective channel. The simulations show that the proposed method tracks the variations of the fading channels with better performance and precision than the conventional LS method, and that it is robust at high mobility speeds.
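As background for the regression step, the sketch below shows a real-valued LS-SVM regressor with an RBF kernel, fitted by solving the usual LS-SVM linear system. It is a simplification under stated assumptions: the paper works with complex-valued channel responses, and the kernel choice, the parameters `gamma` and `sigma`, and the function names here are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] for the bias and dual weights."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]         # bias b, dual coefficients alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i k(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

In a pilot-aided setting the training inputs would correspond to pilot positions and the targets to the observed responses at those positions, with predictions made at the remaining positions; a complex-valued variant would need a suitable treatment of the real and imaginary parts.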