KNN and K-means in Gini Prametric Spaces
Mussard, Cassandra, Charpentier, Arthur, Mussard, Stéphane
arXiv.org Artificial Intelligence
This paper introduces innovative enhancements to the K-means and K-nearest neighbors (KNN) algorithms based on the concept of Gini prametric spaces. Unlike traditional distance metrics, Gini-based measures incorporate both value-based and rank-based information, improving robustness to noise and outliers. The main contributions of this work include: proposing a Gini-based measure that captures both rank information and value distances; presenting a Gini K-means algorithm that is proven to converge and demonstrates resilience to noisy data; and introducing a Gini KNN method that performs competitively with state-of-the-art approaches such as Hassanat's distance in noisy environments. Experimental evaluations on 14 datasets from the UCI repository demonstrate the superior performance and efficiency of Gini-based algorithms in clustering and classification tasks. This work opens new avenues for leveraging rank-based measures in machine learning and statistical analysis.
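The abstract does not give the formula for the Gini-based measure, so the following is only a minimal sketch of how a dissimilarity can combine value gaps with rank gaps. It assumes one plausible form, d(x, y) = Σ_k (x_k − y_k)(F_k(x_k) − F_k(y_k)), where F_k is the empirical CDF of feature k on the training data. This form, and the helper names `ecdf`, `gini_dissimilarity`, and `knn_predict`, are illustrative assumptions and may differ from the paper's definition. Because each F_k is non-decreasing, every term is non-negative and d(x, x) = 0, but d(x, y) can be 0 for x ≠ y, which is consistent with a prametric rather than a metric.

```python
import numpy as np

def ecdf(sorted_cols, x):
    """Empirical CDF of each coordinate of x within its (sorted) training column."""
    n, p = sorted_cols.shape
    return np.array([
        np.searchsorted(sorted_cols[:, k], x[k], side="right") / n
        for k in range(p)
    ])

def gini_dissimilarity(sorted_cols, x, y):
    """Assumed Gini-type dissimilarity: value gap times rank (ECDF) gap, summed over features."""
    fx, fy = ecdf(sorted_cols, x), ecdf(sorted_cols, y)
    return float(np.sum((x - y) * (fx - fy)))

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training points closest to x under the assumed dissimilarity."""
    sorted_cols = np.sort(X_train, axis=0)  # per-feature sorted values for the ECDF
    d = np.array([gini_dissimilarity(sorted_cols, x, xi) for xi in X_train])
    nearest = np.argsort(d, kind="stable")[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data (hypothetical): the last point has an outlying first coordinate.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 5.2], [100.0, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
query = np.array([0.05, 0.05])
prediction = knn_predict(X_train, y_train, query, k=3)
```

In this sketch the rank factor is bounded between 0 and 1, so the rank gap between an extreme value such as 100.0 and its neighbours stays small even though the value gap is large. That is one intuition for why rank-aware measures can be less sensitive to outliers than purely value-based distances.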
Jan-29-2025
- Country:
- North America > United States > California (0.27)
- Genre:
- Research Report > Promising Solution (0.34)
- Technology: