Goto

Collaborating Authors

 Clustering


Deep Discriminative Clustering Analysis

arXiv.org Machine Learning

Traditional clustering methods often perform clustering with low-level indiscriminative representations and ignore relationships between patterns, resulting in slight achievements in the era of deep learning. To handle this problem, we develop Deep Discriminative Clustering (DDC) that models the clustering task by investigating relationships between patterns with a deep neural network. Technically, a global constraint is introduced to adaptively estimate the relationships, and a local constraint is developed to endow the network with the capability of learning high-level discriminative representations. By iteratively training the network and estimating the relationships in a mini-batch manner, DDC theoretically converges and the trained network enables to generate a group of discriminative representations that can be treated as clustering centers for straightway clustering. Extensive experiments strongly demonstrate that DDC outperforms current methods on eight image, text and audio datasets concurrently.


A Survey of Adaptive Resonance Theory Neural Network Models for Engineering Applications

arXiv.org Machine Learning

This survey samples from the ever-growing family of adaptive resonance theory (ART) neural network models used to perform the three primary machine learning modalities, namely, unsupervised, supervised and reinforcement learning. It comprises a representative list from classic to modern ART models, thereby painting a general picture of the architectures developed by researchers over the past 30 years. The learning dynamics of these ART models are briefly described, and their distinctive characteristics such as code representation, long-term memory and corresponding geometric interpretation are discussed. Useful engineering properties of ART (speed, configurability, explainability, parallelization and hardware implementation) are examined along with current challenges. Finally, a compilation of online software libraries is provided. It is expected that this overview will be helpful to new and seasoned ART researchers.


Restricted Connection Orthogonal Matching Pursuit For Sparse Subspace Clustering

arXiv.org Machine Learning

Sparse Subspace Clustering (SSC) is one of the most popular methods for clustering data points into their underlying subspaces. However, SSC may suffer from heavy computational burden. Orthogonal Matching Pursuit applied on SSC accelerates the computation but the trade-off is the loss of clustering accuracy. In this paper, we propose a noise-robust algorithm, Restricted Connection Orthogonal Matching Pursuit for Sparse Subspace Clustering (RCOMP-SSC), to improve the clustering accuracy and maintain the low computational time by restricting the number of connections of each data point during the iteration of OMP. Also, we develop a framework of control matrix to realize RCOMP-SCC. And the framework is scalable for other data point selection strategies. Our analysis and experiments on synthetic data and two real-world databases (EYaleB & Usps) demonstrate the superiority of our algorithm compared with other clustering methods in terms of accuracy and computational time.


Factor Analysis in Fault Diagnostics Using Random Forest

arXiv.org Machine Learning

Factor analysis or sometimes referred to as variable analysis has been extensively used in classification problems for identifying specific factors that are significant to particular classes. This type of analysis has been widely used in application such as customer segmentation, medical research, network traffic, image, and video classification. Today, factor analysis is prominently being used in fault diagnosis of machines to identify the significant factors and to study the root cause of a specific machine fault. The advantage of performing factor analysis in machine maintenance is to perform prescriptive analysis (helps answer what actions to take?) and preemptive analysis (helps answer how to eliminate the failure mode?). In this paper, a real case of an industrial rotating machine was considered where vibration and ambient temperature data was collected for monitoring the health of the machine. Gaussian mixture model-based clustering was used to cluster the data into significant groups, and spectrum analysis was used to diagnose each cluster to a specific state of the machine. The significant features that attribute to a particular mode of the machine were identified by using the random forest classification model. The significant features for specific modes of the machine were used to conclude that the clusters generated are distinct and have a unique set of significant features.


Investigation of Initialization Strategies for the Multiple Instance Adaptive Cosine Estimator

arXiv.org Machine Learning

Sensors which use electromagnetic induction (EMI) to excite a response in conducting bodies have long been investigated for subsurface explosive hazard detection. In particular, EMI sensors have been used to discriminate between different types of objects, and to detect objects with low metal content. One successful, previously investigated approach is the Multiple Instance Adaptive Cosine Estimator (MI-ACE). In this paper, a number of new initialization techniques for MI-ACE are proposed and evaluated using their respective performance and speed. The cross validated learned signatures, as well as learned background statistics, are used with Adaptive Cosine Estimator (ACE) to generate confidence maps, which are clustered into alarms. Alarms are scored against a ground truth and the initialization approaches are compared.


Deep Spectral Clustering using Dual Autoencoder Network

arXiv.org Machine Learning

The clustering methods have recently absorbed even-increasing attention in learning and vision. Deep clustering combines embedding and clustering together to obtain optimal embedding subspace for clustering, which can be more effective compared with conventional clustering methods. In this paper, we propose a joint learning framework for discriminative embedding and spectral clustering. We first devise a dual autoencoder network, which enforces the reconstruction constraint for the latent representations and their noisy versions, to embed the inputs into a latent space for clustering. As such the learned latent representations can be more robust to noise. Then the mutual information estimation is utilized to provide more discriminative information from the inputs. Furthermore, a deep spectral clustering method is applied to embed the latent representations into the eigenspace and subsequently clusters them, which can fully exploit the relationship between inputs to achieve optimal clustering results. Experimental results on benchmark datasets show that our method can significantly outperform state-of-the-art clustering approaches.


Optimal Clustering Framework for Hyperspectral Band Selection

arXiv.org Machine Learning

Band selection, by choosing a set of representative bands in hyperspectral image (HSI), is an effective method to reduce the redundant information without compromising the original contents. Recently, various unsupervised band selection methods have been proposed, but most of them are based on approximation algorithms which can only obtain suboptimal solutions toward a specific objective function. This paper focuses on clustering-based band selection, and proposes a new framework to solve the above dilemma, claiming the following contributions: 1) An optimal clustering framework (OCF), which can obtain the optimal clustering result for a particular form of objective function under a reasonable constraint. 2) A rank on clusters strategy (RCS), which provides an effective criterion to select bands on existing clustering structure. 3) An automatic method to determine the number of the required bands, which can better evaluate the distinctive information produced by certain number of bands. In experiments, the proposed algorithm is compared to some state-of-the-art competitors. According to the experimental results, the proposed algorithm is robust and significantly outperform the other methods on various data sets.


Soft edit distance for differentiable comparison of symbolic sequences

arXiv.org Machine Learning

Edit distance, also known as Levenshtein distance, is an essential way to compare two strings that proved to be particularly useful in the analysis of genetic sequences and natural language processing. However, edit distance is a discrete function that is known to be hard to optimize. This fact hampers the use of this metric in Machine Learning. Even as simple algorithm as K-means fails to cluster a set of sequences using edit distance if they are of variable length and abundance. In this paper we propose a novel metric - soft edit distance (SED), which is a smooth approximation of edit distance. It is differentiable and therefore it is possible to optimize it with gradient methods. Similar to original edit distance, SED as well as its derivatives can be calculated with recurrent formulas at polynomial time. We prove usefulness of the proposed metric on synthetic datasets and clustering of biological sequences.


Clustering Optimization: Finding the Number and Centroids of Clusters by a Fourier-based Algorithm

arXiv.org Machine Learning

We propose a Fourier-based approach for optimization of several clustering algorithms. Mathematically, clusters data can be described by a density function represented by the Dirac mixture distribution. The density function can be smoothed by applying the Fourier transform and a Gaussian filter. The determination of the optimal standard deviation of the Gaussian filter will be accomplished by the use of a convergence criterion related to the correlation between the smoothed and the original density functions. In principle, the optimal smoothed density function exhibits local maxima, which correspond to the cluster centroids. Thus, the complex task of finding the centroids of the clusters is simplified by the detection of the peaks of the smoothed density function. A multiple sliding windows procedure is used to detect the peaks. The remarkable accuracy of the proposed algorithm demonstrates its capability as a reliable general method for enhancement of the clustering performance, its global optimization and also removing the initialization problem in many clustering methods.


Classification from Pairwise Similarities/Dissimilarities and Unlabeled Data via Empirical Risk Minimization

arXiv.org Machine Learning

In supervised classification, we need a vast amount of labeled training data to train our classifiers. However, it is often not easy to obtain labels due to high labeling costs [Chapelle et al., 2010], privacy concern [Warner, 1965], social bias [Nederhof, 1985], and difficulty to label data. For such reasons, there is a situation in real-world classification problems, where pairwise similarities (i.e., pairs of samples in the same class) and pairwise dissimilarities (i.e., pairs of samples in different classes) might be easier to collect than fully labeled data. For example, in the task of protein function prediction [Klein et al., 2002], the knowledge about similarities/dissimilarities can be obtained as additional supervision, which can be found by experimental means. To handle such pairwise information, similar-unlabeled (SU) classification [Bao et al., 2018] has been proposed, where the classification risk is estimated in an unbiased fashion from only similar pairs and unlabeled data. Although they assumed that only similar pairs and unlabeled data are available, we may also obtain dissimilar pairs in practice. In this case, a method which can handle all of similarities/dissimilarities and unlabeled data is desirable. Semi-supervised clustering [Wagstaff et al., 2001] is one of the methods that can handle both similar and dissimilar pairs, where must-link pairs (i.e., similar pairs) and cannot-link pairs (i.e., dissimilar pairs) are used to obtain meaningful clusters.