Mixed Data Clustering Survey and Challenges
Guerard, Guillaume, Djebali, Sonia
arXiv.org Artificial Intelligence
The big data paradigm challenges traditional data management and analysis techniques by demanding solutions capable of processing, analyzing, and deriving insights from vast and diverse datasets. In particular, mixed data types, such as numerical and categorical variables, pose significant challenges to conventional methodologies and call for novel approaches that make effective use of the available information [2]. Traditionally, data handling methods were designed around homogeneous datasets, typically consisting of numerical values. However, the big data paradigm introduces a multitude of data types, including structured, unstructured, and semi-structured data, which demand a departure from traditional approaches. Moreover, the three primary characteristics of big data (volume, velocity, and variety) amplify the complexity of data analysis, requiring scalable and adaptable solutions capable of processing large volumes of data at high speeds while accommodating diverse data formats and structures. Traditional methods for handling mixed data often analyze categorical and numerical variables separately, treating them as distinct entities rather than accounting for their interdependencies.
Dec-4-2025
- Genre:
  - Overview (1.00)
  - Research Report > Promising Solution (1.00)
- Industry:
  - Health & Medicine > Therapeutic Area (0.46)
- Technology: