How to cluster a dataset with high dimensionality and mixed datatypes

Nov-25-2019, 09:34:37 GMT–#artificialintelligence

When it comes to cluster analysis of retail and e-commerce customer data, you will more often than not find the dataset messy, high-dimensional, and full of categorical variables. Although there are many dimensionality reduction techniques, most of them do not work well on datasets with many categorical variables. Traditional clustering approaches also suffer when features are not clean numeric values. For example, k-means, the most popular clustering algorithm, can only handle numeric variables. Generalized low rank models (GLRMs), developed by students at Stanford University (see Udell '16), offer a new clustering framework that handles all types of data, including mixed datatypes.
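To make the low-rank idea concrete, here is a minimal pure-Python sketch, not the GLRM implementation from Udell '16. It assumes a made-up toy customer table and hypothetical helper names (`encode`, `fit_low_rank`, `loss`). It also simplifies heavily: categorical columns are one-hot encoded and every column gets a quadratic loss, which without regularization is the PCA special case of a GLRM. A full GLRM would instead pick a loss suited to each column type.

```python
import random

# Hypothetical toy customer records: (age, monthly_spend, channel).
customers = [
    (23, 40.0, "web"),
    (25, 55.0, "web"),
    (47, 300.0, "store"),
    (52, 280.0, "store"),
    (31, 120.0, "app"),
    (29, 110.0, "app"),
]

def encode(records):
    """Standardize numeric columns and one-hot encode string columns."""
    encoded_columns = []
    for col in zip(*records):
        if isinstance(col[0], str):
            # One indicator column per category level.
            for level in sorted(set(col)):
                encoded_columns.append([1.0 if v == level else 0.0 for v in col])
        else:
            mean = sum(col) / len(col)
            std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
            encoded_columns.append([(v - mean) / std for v in col])
    return [list(row) for row in zip(*encoded_columns)]

def loss(A, X, Y):
    """Sum of squared reconstruction errors between A and the product XY."""
    k = len(Y)
    return sum(
        (A[i][j] - sum(X[i][r] * Y[r][j] for r in range(k))) ** 2
        for i in range(len(A))
        for j in range(len(A[0]))
    )

def fit_low_rank(A, k=2, steps=500, lr=0.01, seed=0):
    """Approximate A (m x n) by X (m x k) times Y (k x n) via gradient descent."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    X = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(m)]
    Y = [[rng.gauss(0, 0.1) for _ in range(n)] for _ in range(k)]
    for _ in range(steps):
        # Residuals of the current reconstruction.
        R = [[sum(X[i][r] * Y[r][j] for r in range(k)) - A[i][j]
              for j in range(n)] for i in range(m)]
        # Gradients of the squared error with respect to X and Y.
        gX = [[2 * sum(R[i][j] * Y[r][j] for j in range(n))
               for r in range(k)] for i in range(m)]
        gY = [[2 * sum(R[i][j] * X[i][r] for i in range(m))
               for j in range(n)] for r in range(k)]
        X = [[X[i][r] - lr * gX[i][r] for r in range(k)] for i in range(m)]
        Y = [[Y[r][j] - lr * gY[r][j] for j in range(n)] for r in range(k)]
    return X, Y

A = encode(customers)        # 6 customers x 5 numeric columns
X, Y = fit_low_rank(A, k=2)  # each row of X is a 2-d customer representation
```

Each row of `X` is a purely numeric, low-dimensional summary of one customer, so it can be passed to a numeric clustering algorithm such as k-means.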




