KNN and K-means in Gini Prametric Spaces
Mussard, Cassandra; Charpentier, Arthur; Mussard, Stéphane
arXiv.org Artificial Intelligence
This paper introduces innovative enhancements to the K-means and K-nearest neighbors (KNN) algorithms based on the concept of Gini prametric spaces. Unlike traditional distance metrics, Gini-based measures incorporate both value-based and rank-based information, improving robustness to noise and outliers. The main contributions of this work include: proposing a Gini-based measure that captures both rank information and value distances; presenting a Gini K-means algorithm that is proven to converge and demonstrates resilience to noisy data; and introducing a Gini KNN method that performs competitively with state-of-the-art approaches such as Hassanat's distance in noisy environments. Experimental evaluations on 14 datasets from the UCI repository demonstrate the superior performance and efficiency of Gini-based algorithms in clustering and classification tasks. This work opens new avenues for leveraging rank-based measures in machine learning and statistical analysis.
Jan-29-2025
- Country:
  - Europe > France
    - Occitanie > Haute-Garonne > Toulouse (0.04)
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States
      - California
        - Los Angeles County > Vincent (0.04)
        - San Mateo County > Menlo Park (0.04)
      - Massachusetts > Middlesex County
        - Cambridge (0.04)
      - Texas (0.04)
  - Oceania > Australia
    - Australian Capital Territory > Canberra (0.06)
- Genre:
- Research Report > Promising Solution (0.34)
- Technology: