Data Science
Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
Belkin, Mikhail, Niyogi, Partha
Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low-dimensional manifold embedded in a higher-dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality-preserving properties and a natural connection to clustering.
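The pipeline the abstract alludes to has three steps: build a neighborhood graph, weight its edges with the heat kernel, and solve a generalized eigenproblem on the graph Laplacian. A minimal dense NumPy/SciPy sketch follows; the k-nearest-neighbor construction and the parameter values are illustrative choices, not the paper's exact settings.

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, n_neighbors=6, t=10.0, n_components=2):
    """Embed points X (n x d) into n_components dimensions.

    Steps: symmetrized kNN graph -> heat-kernel weights W ->
    Laplacian L = D - W -> generalized eigenproblem L y = lambda D y.
    """
    n = X.shape[0]
    # Pairwise squared distances.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # exclude self from neighbor search
    # Symmetrized k-nearest-neighbor adjacency with heat-kernel weights.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    W = np.zeros((n, n))
    W[rows, idx.ravel()] = np.exp(-d2[rows, idx.ravel()] / t)
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian
    # Eigenvalues are returned in ascending order; the first eigenvector
    # is the trivial constant solution, so skip it.
    vals, vecs = eigh(L, D)
    return vecs[:, 1:n_components + 1]
```

The embedding coordinates minimize a weighted sum of squared distances between neighbors, which is what gives the method its locality-preserving character.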
Grouping and dimensionality reduction by locally linear embedding
Polito, Marzia, Perona, Pietro
Locally Linear Embedding (LLE) is an elegant nonlinear dimensionality-reduction technique recently introduced by Roweis and Saul [2]. It fails when the data is divided into separate groups. We study a variant of LLE that can simultaneously group the data and compute a local embedding of each group. An estimate of an upper bound on the intrinsic dimension of the data set is obtained automatically.
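For reference, the standard LLE of Roweis and Saul that the paper builds on can be sketched as follows. This is the baseline, not the grouping variant: the variant's extra insight is that when the neighborhood graph splits into separate groups, the cost matrix M below becomes block diagonal and its near-null space grows, which is what allows grouping and embedding to be done together. A minimal dense sketch with an illustrative regularization constant:

```python
import numpy as np

def lle(X, n_neighbors=6, n_components=2, reg=1e-3):
    """Standard LLE: reconstruct each point from its neighbors, then find
    low-dimensional coordinates preserving those reconstruction weights."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    np.fill_diagonal(d2, np.inf)
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[idx[i]] - X[i]                 # neighbors centered on x_i
        C = Z @ Z.T                          # local covariance
        C += reg * np.trace(C) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, idx[i]] = w / w.sum()           # reconstruction weights
    # Embedding from the bottom eigenvectors of M = (I - W)^T (I - W),
    # skipping the trivial constant eigenvector.
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]
```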
Blind Source Separation via Multinode Sparse Representation
Zibulevsky, Michael, Kisilev, Pavel, Zeevi, Yehoshua Y., Pearlmutter, Barak A.
We consider the problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was recently discovered that exploiting the sparsity of sources in an appropriate representation, according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multiscale transforms, such as wavelets or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property to select the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results.
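The feature-selection step, picking the sparsest subset of multiscale features, can be illustrated with a toy example: a single level of Haar detail coefficients stands in for the paper's wavelet-packet dictionaries, and a normalized l1/l2 ratio stands in for its sparsity measure (both are illustrative assumptions, not the paper's exact choices).

```python
import numpy as np

def haar_details(x):
    """One level of Haar wavelet detail coefficients, a simple
    sparsifying transform for piecewise-smooth signals."""
    x = np.asarray(x, dtype=float)
    x = x[: len(x) // 2 * 2]            # even length
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

def l1_l2_sparsity(c):
    """Normalized l1/l2 ratio in (0, 1]; smaller means sparser."""
    c = np.asarray(c, dtype=float)
    return np.sum(np.abs(c)) / (np.sqrt(c.size) * np.linalg.norm(c) + 1e-12)

# A piecewise-constant source is dense in the time domain but sparse in
# the detail band: nearly all detail coefficients vanish except at jumps.
t = np.arange(256)
s = np.where(t < 128, 1.0, -1.0) + np.where((t > 64) & (t < 96), 2.0, 0.0)
raw_score = l1_l2_sparsity(s)
detail_score = l1_l2_sparsity(haar_details(s))
```

Since `detail_score` is much smaller than `raw_score`, the detail band would be selected as the better feature set for estimating the mixing directions.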
Support Vector Novelty Detection Applied to Jet Engine Vibration Spectra
Hayton, Paul M., Schölkopf, Bernhard, Tarassenko, Lionel, Anuzis, Paul
A system has been developed to extract diagnostic information from jet engine carcass vibration data. Support Vector Machines applied to novelty detection provide a measure of how unusual the shape of a vibration signature is by learning a representation of normality. We describe a novel method for including information from a second class in Support Vector Machine novelty detection and give results from its application to jet engine vibration analysis.
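The underlying idea, learning a region of normality and scoring how far a new signature falls outside it, can be sketched with the standard one-class SVM of Schölkopf et al. as implemented in scikit-learn. This is the baseline the paper extends, not its two-class method, and the data here are synthetic stand-ins for vibration spectra.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Train only on "normal" feature vectors (synthetic placeholders for
# the engine vibration signatures used in the paper).
rng = np.random.RandomState(0)
normal = rng.randn(200, 2) * 0.5

# nu bounds the fraction of training points treated as outliers.
clf = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5).fit(normal)

# decision_function is positive inside the learned region of
# normality and negative outside it.
inlier_score = clf.decision_function([[0.0, 0.0]])[0]
outlier_score = clf.decision_function([[6.0, 6.0]])[0]
```

A signature with a strongly negative score would be flagged as unusual and passed on for diagnosis.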
A Linear Programming Approach to Novelty Detection
Campbell, Colin, Bennett, Kristin P.
Novelty detection involves modeling the normal behaviour of a system, hence enabling detection of any divergence from normality. It has potential applications in many areas, such as detection of machine damage or highlighting abnormal features in medical data. One approach is to build a hypothesis estimating the support of the normal data, i.e., constructing a function which is positive in the region where the data is located and negative elsewhere. Recently, kernel methods have been proposed for estimating the support of a distribution, and they have performed well in practice; training involves the solution of a quadratic programming problem. In this paper we propose a simpler kernel method for estimating the support, based on linear programming. The method is easy to implement and can learn large datasets rapidly. We demonstrate the method on medical and fault detection datasets.
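One plausible linear program in the spirit of the abstract (not necessarily Campbell and Bennett's exact formulation) fits f(x) = sum_j alpha_j K(x_j, x) with alpha >= 0 and sum_j alpha_j = 1, maximizing a threshold rho minus a penalty on slack variables; a point is flagged as novel when f(x) < rho. A sketch using scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

def rbf_kernel(A, B, gamma=0.5):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def fit_lp_novelty(X, C=1.0, gamma=0.5):
    """Solve: max rho - C * sum_i xi_i
       s.t.  sum_j alpha_j K(x_i, x_j) >= rho - xi_i,
             alpha >= 0, sum_j alpha_j = 1, xi >= 0.
    Variable order in the LP vector: alpha (m), xi (m), rho (1)."""
    m = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    c = np.concatenate([np.zeros(m), np.full(m, C), [-1.0]])
    # Inequalities: rho - (K alpha)_i - xi_i <= 0
    A_ub = np.hstack([-K, -np.eye(m), np.ones((m, 1))])
    b_ub = np.zeros(m)
    A_eq = np.concatenate([np.ones(m), np.zeros(m), [0.0]])[None, :]
    bounds = [(0, None)] * (2 * m) + [(None, None)]  # rho is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    alpha, rho = res.x[:m], res.x[-1]

    def score(Z):
        return rbf_kernel(Z, X, gamma) @ alpha

    return score, rho
```

Only a linear program is solved, rather than the quadratic program of the earlier kernel methods, which is the simplification the abstract emphasizes.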