multi-slice clustering
Parallel Computation of Multi-Slice Clustering of Third-Order Tensors
Andriantsiory, Dina Faneva; Coti, Camille; Geloun, Joseph Ben; Lebbah, Mustapha
Machine learning approaches such as clustering methods must handle massive datasets, which poses a growing challenge. We devise parallel algorithms to compute the Multi-Slice Clustering (MSC) of 3rd-order tensors. The MSC method is based on spectral analysis of the tensor slices and works independently on each tensor mode. These features fit well with the parallel paradigm on a distributed-memory system. We show that our parallel scheme outperforms sequential computation and makes the MSC method scalable.
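Because the spectral step treats each tensor mode independently, the per-mode work can be dispatched as separate tasks. The following is a minimal sketch of that idea, not the authors' implementation: a shared-memory thread pool stands in for the distributed-memory system mentioned in the abstract, and the names `slice_spectra` and `spectra_all_modes` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def slice_spectra(tensor, mode):
    """Leading eigenvalue of S^T S for every matrix slice S along `mode`."""
    slices = np.moveaxis(tensor, mode, 0)            # slices[i] is the i-th slice
    sing = np.linalg.svd(slices, compute_uv=False)   # batched SVD, one per slice
    return sing[:, 0] ** 2                           # squared top singular value


def spectra_all_modes(tensor, workers=3):
    """Run the per-mode spectral step as independent tasks (assumption: threads, not MPI)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda m: slice_spectra(tensor, m), range(tensor.ndim)))


rng = np.random.default_rng(0)
tensor = rng.standard_normal((6, 7, 8))
parallel = spectra_all_modes(tensor)
sequential = [slice_spectra(tensor, mode) for mode in range(3)]
```

Each task reads the tensor and writes nothing shared, so the parallel and sequential results agree. In a distributed-memory setting the same decomposition would assign modes (or blocks of slices) to separate processes.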
DBSCAN of Multi-Slice Clustering for Third-Order Tensors
Andriantsiory, Dina Faneva; Geloun, Joseph Ben; Lebbah, Mustapha
Several methods for triclustering three-dimensional data require as hyperparameters the cluster size set or the number of clusters in each dimension. These methods raise an issue since, for real datasets, those inputs cannot be determined without extreme cost. The recently introduced Multi-Slice Clustering (MSC) tackles this issue by using a threshold parameter to perform the data clustering. The MSC finds signal slices that lie in a lower-dimensional subspace of 3rd-order rank-1 tensor datasets. The present work addresses an extension of this algorithm, namely the MSC-DBSCAN, that extracts several slice clusters that lie in different subspaces when the 3rd-order dataset is a sum of r ≥ 1 rank-1 tensors. Our algorithm uses the same input as the MSC algorithm and reduces to the same cluster solution for a rank-1 tensor dataset.
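To illustrate how a density-based step can separate slices that lie in different subspaces, here is a rough sketch for a single tensor mode. It is an assumption-laden illustration, not the published MSC-DBSCAN: the distance `1 - |cos|` between top eigenvectors, the parameters `eps` and `min_pts`, and the small hand-written `dbscan` helper are all hypothetical choices.

```python
import numpy as np


def dbscan(dist, eps, min_pts):
    """Minimal DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        queue = list(np.flatnonzero(dist[i] <= eps))
        if len(queue) < min_pts:          # not a core point
            continue
        labels[i] = cluster
        while queue:
            j = queue.pop()
            if labels[j] == -1:           # unassigned or former noise: join cluster
                labels[j] = cluster
            if visited[j]:
                continue
            visited[j] = True
            neighbours = np.flatnonzero(dist[j] <= eps)
            if len(neighbours) >= min_pts:
                queue.extend(neighbours)  # expand only through core points
        cluster += 1
    return labels


def slice_labels(tensor, mode, eps=0.05, min_pts=3):
    """Group slices along `mode` by the direction of their top eigenvector."""
    slices = np.moveaxis(tensor, mode, 0)
    _, _, vt = np.linalg.svd(slices, full_matrices=False)
    vecs = vt[:, 0, :]                           # top eigenvector of S^T S per slice
    dist = 1.0 - np.abs(vecs @ vecs.T)           # sign-invariant cosine distance
    return dbscan(dist, eps, min_pts)


# Synthetic rank-2 example: two planted slice groups in orthogonal subspaces.
rng = np.random.default_rng(0)
a1, a2 = np.zeros(14), np.zeros(14)
a1[:4], a2[4:8] = 0.5, 0.5
b1, b2 = np.zeros(10), np.zeros(10)
b1[:5], b2[5:] = 1 / np.sqrt(5), 1 / np.sqrt(5)
signal = 50.0 * (np.einsum("i,j,k->ijk", a1, b1, b1) + np.einsum("i,j,k->ijk", a2, b2, b2))
tensor = signal + 0.1 * rng.standard_normal((14, 10, 10))
labels = slice_labels(tensor, mode=0)
```

In this toy setting, slices 0 to 3 and slices 4 to 7 form two separate clusters, while the remaining pure-noise slices are left unassigned. With a single rank-1 component only one cluster would remain, which mirrors the reduction to the MSC solution described above.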
Multi-Slice Clustering for 3-order Tensor Data
Andriantsiory, Dina Faneva; Geloun, Joseph Ben; Lebbah, Mustapha
Several methods for triclustering three-dimensional data require the cluster size to be specified in each dimension, which introduces a degree of arbitrariness. To address this issue, we propose a new method, the multi-slice clustering (MSC), for a third-order tensor data set. In each dimension, or tensor mode, we analyse the spectral decomposition of each tensor slice, i.e. a matrix. We then define a similarity measure between matrix slices up to a threshold (precision) parameter and, from that, identify a cluster. The intersection of all partial clusters provides the desired triclustering. The effectiveness of our algorithm is shown on both synthetic and real-world data sets.
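The pipeline described above (per-slice spectral decomposition, a similarity between slices, a threshold, then combining the per-mode clusters) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' exact procedure: the similarity built from eigenvalue-weighted top eigenvectors and the selection rule using `eps` are hypothetical stand-ins, as are the function names.

```python
import numpy as np


def mode_cluster(tensor, mode, eps):
    """Indices of slices along `mode` that share a dominant spectral direction."""
    slices = np.moveaxis(tensor, mode, 0)             # slices[i] is the i-th matrix slice
    _, sing, vt = np.linalg.svd(slices, full_matrices=False)
    lam = sing[:, 0] ** 2                             # top eigenvalue of S^T S
    vecs = vt[:, 0, :]                                # matching top eigenvector
    weighted = (lam / lam.max())[:, None] * vecs      # damp weak, noise-like slices
    sim = np.abs(weighted @ weighted.T)               # pairwise slice similarity
    score = sim.sum(axis=1)
    # Hypothetical rule: keep slices whose score is within a factor (1 - eps) of the best.
    return np.flatnonzero(score >= (1.0 - eps) * score.max())


def multi_slice_clustering(tensor, eps):
    """One index set per mode; together they delimit the tricluster."""
    return [mode_cluster(tensor, mode, eps) for mode in range(3)]


# Synthetic check: a rank-1 signal planted on indices 0..3 of every mode.
rng = np.random.default_rng(0)
u = np.zeros(12)
u[:4] = 0.5
tensor = 50.0 * np.einsum("i,j,k->ijk", u, u, u) + 0.1 * rng.standard_normal((12, 12, 12))
clusters = multi_slice_clustering(tensor, eps=0.5)
```

On this strongly separated example each mode returns the planted indices 0 to 3, so the tricluster is the 4 x 4 x 4 block spanned by the three index sets. Note that no cluster size is supplied; only the threshold `eps` is.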