AITopics | Clustering


Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics. (Wikipedia)


Agglomerative Neural Networks for Multi-view Clustering

Liu, Zhe, Li, Yun, Yao, Lina, Wang, Xianzhi, Nie, Feiping

arXiv.org Machine Learning, May-12-2020

Conventional multi-view clustering methods seek a view consensus by minimizing the pairwise discrepancy between the consensus and subviews. However, the pairwise comparison cannot portray the inter-view relationship precisely if some of the subviews can be further agglomerated. To address this challenge, we propose agglomerative analysis to approximate the optimal consensus view, thereby describing the subview relationship within a view structure. We present the Agglomerative Neural Network (ANN), based on Constrained Laplacian Rank, to cluster multi-view data directly while avoiding a dedicated postprocessing step (e.g., using K-means). We further extend ANN with a learnable data space to handle data from complex scenarios. Our evaluations against several state-of-the-art multi-view clustering approaches on four popular datasets show the promising view-consensus analysis ability of ANN. In our case study, we further demonstrate ANN's capability in analyzing complex view structures and its extensibility, and explain its robustness and the effectiveness of data-driven modifications.

artificial intelligence, information, machine learning, (16 more...)

arXiv.org Machine Learning

2005.05556

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
South America > Chile > Arica y Parinacota Region > Arica Province > Arica (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)


Community Detection Clustering via Gumbel Softmax

Acharya, Deepak Bhaskar, Zhang, Huaming

arXiv.org Machine Learning, May-11-2020

Deep learning has recently been widely adopted in systems such as speech recognition and visual processing. In this research, we explore the possibility of using deep learning for community detection in graph datasets. Graphs have gained growing traction in different fields, including social networks, information graphs, recommender systems, and the life sciences. In this paper, we propose a method of community detection that clusters the nodes of various graph datasets. We cluster datasets from different categories: affiliation networks, animal networks, human contact networks, human social networks, and miscellaneous networks. The role of deep learning in modeling the interaction between nodes in a network enables a revolution in the science of graph network analysis. In this paper, we extend the Gumbel softmax approach to graph network clustering. The experimental findings on specific graph datasets reveal that the new approach outperforms traditional clustering significantly, which strongly shows the efficacy of deep learning in graph community detection clustering. We run a series of experiments on our graph clustering algorithm using various datasets: Zachary karate club, Highland Tribe, Train bombing, American Revolution, Dolphins, Zebra, Windsurfers, Les Misérables, and Political books.
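The Gumbel softmax trick named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `gumbel_softmax`, the example scores, and the temperature value are assumptions chosen for illustration. The sketch perturbs per-community scores with Gumbel noise and applies a temperature-scaled softmax, which yields a relaxed (differentiable) stand-in for a hard community assignment.

```python
import math
import random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one relaxed one-hot sample from unnormalised log-probabilities."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled scores.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores of one graph node for three candidate communities.
rng = random.Random(0)
soft = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
# A hard assignment is the arg-max of the relaxed sample.
hard = max(range(len(soft)), key=soft.__getitem__)
```

Lower temperatures push the relaxed sample toward a one-hot vector; higher temperatures make it closer to uniform.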

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

2005.02372

Country:

North America > United States > California (0.14)
Europe > Spain > Galicia > Madrid (0.05)
Oceania > New Zealand (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology (0.57)
Law Enforcement & Public Safety > Terrorism (0.36)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)


A Novel Granular-Based Bi-Clustering Method of Deep Mining the Co-Expressed Genes

Xu, Kaijie, Pedrycz, Witold, Li, Zhiwu, Quan, Yinghui, Nie, Weike

arXiv.org Artificial Intelligence, May-11-2020

Traditional clustering methods are limited when dealing with huge and heterogeneous groups of gene expression data, which motivates the development of bi-clustering methods. Bi-clustering methods are used to mine bi-clusters whose subsets of samples (genes) are co-regulated under their test conditions. Studies show that mining bi-clusters with consistent trends, and with trends that have similar degrees of fluctuation, from gene expression data is essential in bioinformatics research. Unfortunately, traditional bi-clustering methods are not fully effective in discovering such bi-clusters. Therefore, we propose a novel bi-clustering method that draws on the theory of Granular Computing. In the proposed scheme, the gene data matrix, considered as a group of time series, is transformed into a series of ordered information granules. With the information granules, we build a characteristic matrix of the gene data that captures the fluctuation trend of the expression value between consecutive conditions in order to mine the ideal bi-clusters. The experimental results agree with the theoretical analysis and show the excellent performance of the proposed method.

artificial intelligence, information granule, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2005.05519

Country:

Asia > Macao (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(7 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)


New Ideas for Brain Modelling 6

Greer, Kieran

arXiv.org Artificial Intelligence, May-11-2020

This paper describes implementation details for a 3-level cognitive model, described in the paper series. The whole architecture is now modular, with different levels using different types of information. The ensemble-hierarchy relationship is maintained and placed in the bottom optimising and middle aggregating levels, to store memory objects and their relations. The top-level cognitive layer has been re-designed to model the Cognitive Process Language (CPL) of an earlier paper, by refactoring it into a network structure with a light scheduler. The cortex brain region is thought to be hierarchical, clustering from simple to more complex features. The refactored network might therefore challenge conventional thinking on that brain region. It is also argued that the function, and the structure in particular, of the new top level is similar to the psychological theory of chunking. The model is still only a framework and does not have enough information for real intelligence. But a framework is now implemented over the whole design and so can give a more complete picture of the potential for results.

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.3934/biophy.2020022.

2005.05137

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Switzerland (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)


Improving The Performance Of The K-means Algorithm

Nguyen, Tien-Dung

arXiv.org Machine Learning, May-10-2020

The Incremental K-means (IKM), an improved version of K-means (KM), was introduced to improve the clustering quality of KM significantly. However, IKM is slower than KM. My thesis proposes two algorithms that speed up IKM while approximately preserving the quality of its clustering result. The first algorithm, called Divisive K-means, improves the speed of IKM by speeding up its cluster-splitting process. In tests on UCI Machine Learning data sets, the new algorithm reaches the same empirical global optimum as IKM and has lower complexity, $O(k \log_{2} k \cdot n)$, than IKM, $O(k^{2} n)$. The second algorithm, called Parallel Two-Phase K-means (Par2PK-means), parallelizes IKM by employing the model of Two-Phase K-means. In tests on large data sets, this algorithm attains a good speedup ratio, close to linear speedup.
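To give a feel for the gap between the two complexity bounds quoted in this abstract, the following sketch evaluates both growth terms for one example size. The function names and the values of `k` and `n` are assumptions for illustration, and constant factors are ignored, so the numbers indicate relative growth only, not measured running time.

```python
import math

def divisive_cost(k, n):
    """Growth term k * log2(k) * n quoted for Divisive K-means."""
    return k * math.log2(k) * n

def incremental_cost(k, n):
    """Growth term k^2 * n quoted for Incremental K-means."""
    return k ** 2 * n

# Hypothetical problem size: 64 clusters, 10,000 points.
k, n = 64, 10_000
fast = divisive_cost(k, n)      # 64 * 6 * 10,000
slow = incremental_cost(k, n)   # 4,096 * 10,000
ratio = slow / fast             # simplifies to k / log2(k)
```

The ratio of the two terms is k / log2(k), so the advantage of the divisive variant grows with the number of clusters and does not depend on the number of points.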

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2005.04689

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Orange County > Irvine (0.14)
(4 more...)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.72)


Comparison and Benchmark of Graph Clustering Algorithms

Shi, Lizhen, Chen, Bo

arXiv.org Machine Learning, May-10-2020

Graph clustering is widely used in the analysis of biological networks, social networks, etc. Many graph clustering algorithms have been published over more than a decade; however, a comprehensive and consistent performance comparison is not available. In this paper, we benchmark more than 70 graph clustering programs to evaluate their runtime and quality performance on both weighted and unweighted graphs. We also analyze the characteristics of the ground truth that affect performance. Our work can not only supply a starting point for engineers selecting clustering algorithms but also provide a viewpoint for researchers designing new algorithms.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Machine Learning

2005.04806

Country:

North America > United States > Florida > Leon County > Tallahassee (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)


Data Science & Machine Learning For Non Technical Executives

#artificialintelligence, May-8-2020, 11:25:46 GMT

Udemy course: Data Science & Machine Learning For Non Technical Executives, by Ankit Mistry. Includes 8 hours of on-demand video, 3 articles, 34 downloadable resources, and full lifetime access. Topics: the basic idea of machine learning technology; different ML algorithms such as regression, classification, and clustering; the KNN and logistic regression algorithms; linear and multiple regression; the K-means clustering algorithm; an overview of deep learning and the computer vision field. Description: Welcome to the course on Data Science & Machine Learning For Non Technical Executives. Disclaimer: This is not a Python-based machine learning course. I would strongly suggest that you not enroll in this course if you are interested in the implementation side of machine learning algorithms. There are many courses on Udemy that teach machine learning with R/Python. I have designed this course for absolute beginners and non-technical people who just want to start diving into the machine learning world.

artificial intelligence, machine learning, regression, (12 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.75)


Is an Affine Constraint Needed for Affine Subspace Clustering?

You, Chong, Li, Chun-Guang, Robinson, Daniel P., Vidal, Rene

arXiv.org Machine Learning, May-8-2020

Subspace clustering methods based on expressing each data point as a linear combination of other data points have achieved great success in computer vision applications such as motion segmentation, face and digit clustering. In face clustering, the subspaces are linear and subspace clustering methods can be applied directly. In motion segmentation, the subspaces are affine and an additional affine constraint on the coefficients is often enforced. However, since affine subspaces can always be embedded into linear subspaces of one extra dimension, it is unclear if the affine constraint is really necessary. This paper shows, both theoretically and empirically, that when the dimension of the ambient space is high relative to the sum of the dimensions of the affine subspaces, the affine constraint has a negligible effect on clustering performance. Specifically, our analysis provides conditions that guarantee the correctness of affine subspace clustering methods both with and without the affine constraint, and shows that these conditions are satisfied for high-dimensional data. Underlying our analysis is the notion of affinely independent subspaces, which not only provides geometrically interpretable correctness conditions, but also clarifies the relationships between existing results for affine subspace clustering.
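The remark that affine subspaces can be embedded into linear subspaces of one extra dimension can be checked on a small example. The sketch below is an illustration under assumptions, not the paper's method: the `rank` helper, the example line in R^3, and the variable names are all hypothetical. Appending a constant 1 to each point (a homogeneous embedding) places a 1-dimensional affine line inside a 2-dimensional linear subspace of R^4, and the appended coordinate forces the coefficients of any linear combination that reproduces an embedded point to sum to 1.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Points on the affine line {(t, 2t + 1, -t + 2)} in R^3 (it misses the origin).
points = [(t, 2 * t + 1, -t + 2) for t in range(4)]
# Homogeneous embedding: append a constant 1 to every point.
embedded = [p + (1,) for p in points]
embedded_rank = rank(embedded)  # dimension of the linear span in R^4

# Writing the fourth embedded point as a * first + b * second: the appended
# coordinate reads a + b = 1, so the affine constraint holds automatically.
a, b = -2, 3
combo = tuple(a * u + b * v for u, v in zip(embedded[0], embedded[1]))
```

Here `embedded_rank` is 2, one more than the dimension of the line, and `combo` equals `embedded[3]` with `a + b == 1`.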

artificial intelligence, machine learning, subspace, (16 more...)

arXiv.org Machine Learning

2005.03888

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Montana (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.74)


Federated learning with hierarchical clustering of local updates to improve training on non-IID data

Briggs, Christopher, Fan, Zhong, Andras, Peter

arXiv.org Machine Learning, May-6-2020

Federated learning (FL) is a well-established method for performing machine learning tasks over massively distributed data. However, in settings where data is distributed in a non-iid (not independent and identically distributed) fashion -- as is typical in real-world situations -- the joint model produced by FL suffers in terms of test set accuracy and/or communication costs compared to training on iid data. We show that learning a single joint model is often not optimal in the presence of certain types of non-iid data. In this work, we present a modification to FL that introduces a hierarchical clustering step (FL+HC) to separate clusters of clients by the similarity of their local updates to the global joint model. Once separated, the clusters are trained independently and in parallel on specialised models. We present a robust empirical analysis of the hyperparameters for FL+HC for several iid and non-iid settings. We show how FL+HC allows model training to converge in fewer communication rounds (significantly so under some non-iid settings) compared to FL without clustering. Additionally, FL+HC allows a greater percentage of clients to reach a target accuracy compared to standard FL. Finally, we make suggestions for good default hyperparameters to promote better-performing specialised models without modifying the underlying federated learning communication protocol.
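The clustering step described in this abstract, grouping clients by the similarity of their local updates, can be sketched with a small agglomerative routine. This is a simplified illustration, not the FL+HC implementation: the Euclidean distance, average linkage, distance threshold, function names, and toy update vectors are all assumptions (the abstract treats such choices as hyperparameters).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two update vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(updates, threshold):
    """Merge the closest clusters (average linkage) until none lie within threshold."""
    clusters = [[i] for i in range(len(updates))]

    def linkage(c1, c2):
        dists = [euclidean(updates[i], updates[j]) for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        dist, a, b = min(
            ((linkage(clusters[a], clusters[b]), a, b)
             for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda item: item[0],
        )
        if dist > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical flattened local updates from five clients.
client_updates = [
    (0.10, 0.20), (0.12, 0.18), (0.11, 0.22),  # clients 0-2: similar updates
    (-0.90, 0.75), (-0.88, 0.80),              # clients 3-4: a different direction
]
groups = agglomerate(client_updates, threshold=0.5)
```

With these toy vectors the routine returns two groups, clients 0 to 2 and clients 3 to 4; in the setting the abstract describes, each such group would then be trained on its own specialised model.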

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2004.11791

Country:

Europe > United Kingdom > England > Staffordshire > Keele (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)


Deep Divergence Learning

Cilingir, Kubra, Manzelli, Rachel, Kulis, Brian

arXiv.org Machine Learning, May-6-2020

Classical linear metric learning methods have recently been extended along two distinct lines: deep metric learning methods for learning embeddings of the data using neural networks, and Bregman divergence learning approaches for extending learning Euclidean distances to more general divergence measures such as divergences over distributions. In this paper, we introduce deep Bregman divergences, which are based on learning and parameterizing functional Bregman divergences using neural networks, and which unify and extend these existing lines of work. We show in particular how deep metric learning formulations, kernel metric learning, Mahalanobis metric learning, and moment-matching functions for comparing distributions arise as special cases of ...

These methods, known as Mahalanobis metric learning approaches, have been analyzed theoretically, are scalable, and usually involve convex optimization problems that can be solved globally (Kulis, 2013; Bellet et al., 2015). Classical metric learning methods have been extended along various axes; two important directions are deep metric learning and Bregman divergence learning. Deep metric learning approaches replace the linear mapping learned in Mahalanobis metric learning methods with more general mappings that are learned via neural networks (Hoffer & Ailon, 2015; Chopra et al., 2005). On the other hand, Bregman divergence methods replace the squared Euclidean distance with arbitrary Bregman divergences (Bregman, 1967), and learn the underlying generating function of the Bregman divergence via piecewise linear approximators (Siahkamari et al., 2019) or convex combinations of existing basis functions (Wu et al., 2009).
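For readers unfamiliar with Bregman divergences, the standard definition for a convex generating function phi is D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>. The sketch below evaluates it for two textbook generating functions with hand-written gradients; the function names and example vectors are assumptions, and nothing here is learned, unlike the neural parameterization the paper proposes.

```python
import math

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

def sq_norm(v):
    """phi(v) = ||v||^2, which generates the squared Euclidean distance."""
    return sum(t * t for t in v)

def sq_norm_grad(v):
    return [2 * t for t in v]

def neg_entropy(p):
    """phi(p) = sum p_i log p_i, which generates the (generalised) KL divergence."""
    return sum(t * math.log(t) for t in p)

def neg_entropy_grad(p):
    return [math.log(t) + 1 for t in p]

# Two hypothetical probability vectors (strictly positive, each summing to 1).
x = [0.2, 0.3, 0.5]
y = [0.4, 0.4, 0.2]

sq_euclid = bregman(sq_norm, sq_norm_grad, x, y)
kl = bregman(neg_entropy, neg_entropy_grad, x, y)
```

With the squared norm as generating function the result equals the squared Euclidean distance between `x` and `y`; with negative entropy it equals the KL divergence, since both vectors sum to 1.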

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

2005.02612

Country:

North America > United States > Rocky Mountains (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > Rocky Mountains (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
