repliclust: Synthetic Data for Cluster Analysis
Zellinger, Michael J., Bühlmann, Peter
arXiv.org Artificial Intelligence
Our approach is based on data set archetypes, high-level geometric descriptions from which the user can create many different data sets, each possessing the desired geometric characteristics. The architecture of our software is modular and object-oriented, decomposing data generation into algorithms for placing cluster centers, sampling cluster shapes, selecting the number of data points for each cluster, and assigning probability distributions to clusters.
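The four-stage decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the repliclust API: every function name, parameter, and default below (`place_centers`, `sample_shapes`, `choose_sizes`, `generate`, `spread`, `max_aspect`, `imbalance`) is a hypothetical stand-in, and the specific choices (uniform center placement, random rotated ellipsoids, normal or Student-t samplers) are assumptions made only to show how the stages could fit together.

```python
import numpy as np


def place_centers(k, dim, rng, spread=10.0):
    """Stage 1 (assumed rule): draw k centers uniformly in a cube."""
    return rng.uniform(-spread, spread, size=(k, dim))


def sample_shapes(k, dim, rng, max_aspect=3.0):
    """Stage 2 (assumed rule): one linear map per cluster.

    Each map is a random rotation times random axis lengths, so it
    turns a spherical sample into a randomly oriented ellipsoid.
    """
    shapes = []
    for _ in range(k):
        q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
        axes = rng.uniform(1.0, max_aspect, size=dim)
        shapes.append(q * axes)  # scales column j of q by axes[j]
    return np.stack(shapes)


def choose_sizes(k, n, rng, imbalance=2.0):
    """Stage 3 (assumed rule): split n points into k unequal groups."""
    weights = rng.uniform(1.0, imbalance, size=k)
    sizes = np.floor(n * weights / weights.sum()).astype(int)
    sizes[0] += n - sizes.sum()  # give the rounding remainder to cluster 0
    return sizes


def generate(k=3, dim=2, n=300, seed=0):
    """Run the four stages and return points X and cluster labels y."""
    rng = np.random.default_rng(seed)
    centers = place_centers(k, dim, rng)
    shapes = sample_shapes(k, dim, rng)
    sizes = choose_sizes(k, n, rng)

    # Stage 4 (assumed rule): assign each cluster a base distribution.
    samplers = [
        lambda m: rng.standard_normal((m, dim)),        # Gaussian
        lambda m: rng.standard_t(5.0, size=(m, dim)),   # heavier tails
    ]
    picks = rng.integers(0, len(samplers), size=k)

    blocks = []
    for center, shape, m, pick in zip(centers, shapes, sizes, picks):
        z = samplers[pick](int(m))          # spherical sample
        blocks.append(z @ shape.T + center)  # stretch, rotate, shift
    X = np.vstack(blocks)
    y = np.repeat(np.arange(k), sizes)
    return X, y


if __name__ == "__main__":
    X, y = generate(k=3, dim=2, n=300, seed=0)
    print(X.shape, np.bincount(y))
```

Keeping each stage as a separate function mirrors the modularity the abstract describes: any one rule can be swapped without touching the others.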
Mar-24-2023
- Country:
- North America
- Canada > Alberta (0.14)
- United States
- New York (0.04)
- California
- Los Angeles County > Pasadena (0.04)
- Alameda County > Berkeley (0.04)
- Europe > Switzerland
- Genre:
- Research Report (1.00)
- Technology: