AITopics | rk-ccd

Collaborating Authors

rk-ccd

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Cluster Catch Digraphs with the Nearest Neighbor Distance

Shi, Rui, Billor, Nedret, Ceyhan, Elvan

arXiv.org Machine LearningJan-9-2025

We introduce a new method for clustering based on Cluster Catch Digraphs (CCDs). The new method addresses the limitations of RK-CCDs by employing a new variant of spatial randomness test that employs the nearest neighbor distance (NND) instead of the Ripley's K function used by RK-CCDs. We conduct a comprehensive Monte Carlo analysis to assess the performance of our method, considering factors such as dimensionality, data set size, number of clusters, cluster volumes, and inter-cluster distance. Our method is particularly effective for high-dimensional data sets, comparable to or outperforming KS-CCDs and RK-CCDs that rely on a KS-type statistic or the Ripley's K function. We also evaluate our methods using real and complex data sets, comparing them to well-known clustering methods. Again, our methods exhibit competitive performance, producing high-quality clusters with desirable properties. Keywords: Graph-based clustering, Cluster catch digraphs, High-dimensional data, The nearest neighbor distance, Spatial randomness test

artificial intelligence, cluster 0, machine learning, (16 more...)

arXiv.org Machine Learning

2501.06268

Country: North America > United States > New York (0.15)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Outlier Detection with Cluster Catch Digraphs

Shi, Rui, Billor, Nedret, Ceyhan, Elvan

arXiv.org Machine LearningSep-17-2024

This paper introduces a novel family of outlier detection algorithms based on Cluster Catch Digraphs (CCDs), specifically tailored to address the challenges of high dimensionality and varying cluster shapes, which deteriorate the performance of most traditional outlier detection methods. We propose the Uniformity-Based CCD with Mutual Catch Graph (U-MCCD), the Uniformity- and Neighbor-Based CCD with Mutual Catch Graph (UN-MCCD), and their shape-adaptive variants (SU-MCCD and SUN-MCCD), which are designed to detect outliers in data sets with arbitrary cluster shapes and high dimensions. We present the advantages and shortcomings of these algorithms and provide the motivation or need to define each particular algorithm. Through comprehensive Monte Carlo simulations, we assess their performance and demonstrate the robustness and effectiveness of our algorithms across various settings and contamination levels. We also illustrate the use of our algorithms on various real-life data sets. The U-MCCD algorithm efficiently identifies outliers while maintaining high true negative rates, and the SU-MCCD algorithm shows substantial improvement in handling non-uniform clusters. Additionally, the UN-MCCD and SUN-MCCD algorithms address the limitations of existing methods in high-dimensional spaces by utilizing Nearest Neighbor Distances (NND) for clustering and outlier detection. Our results indicate that these novel algorithms offer substantial advancements in the accuracy and adaptability of outlier detection, providing a valuable tool for various real-world applications. Keyword: Outlier detection, Graph-based clustering, Cluster catch digraphs, $k$-nearest-neighborhood, Mutual catch graphs, Nearest neighbor distance.

algorithm, outlier, simulation, (14 more...)

arXiv.org Machine Learning

2409.11596

Country:

North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
North America > United States > New York > New York County > New York City (0.04)
(12 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Information Technology (0.92)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Parameter Free Clustering with Cluster Catch Digraphs (Technical Report)

Manukyan, Artür, Ceyhan, Elvan

arXiv.org Machine LearningDec-26-2019

Clustering is one of the most challenging tasks in machine learning and pattern recognition, and perhaps, discovering the exact number of clusters of an unlabelled data set is the leading one. Many clustering methods find the clusters (or hidden classes) and the number of these clusters simultaneously (Frey and Dueck, 2007; Sajana et al., 2016). Although there exist methods for validating and comparing the quality of a partitioning of a data set, algorithms that provide the (estimated) number of clusters without any input parameter are still appealing. However, such methods or algorithms rely on other parameters viewed as the intensity, i.e. expected number of objects in a unit area. The value of the intensity parameter works as a threshold, and if the local intensity of the data set exceeds the threshold, it may indicate the existence of a possible cluster. However, the choice of such parameters is often a difficult task since different values of such parameters may drastically change the result of the algorithm. We use unsupervised adaptations of a family of vertex random digraphs, namely class cover catch digraphs (CCCDs), that showed relatively good performance in statistical pattern classification (Manukyan and Ceyhan, 2016; Priebe et al., 2003a). Unsupervised versions of CCCDs are called cluster catch digraphs (CCDs) (DeVinney, 2003; Marchette, 2004). Primarily, CCDs use statistics that require an intensity parameter to be specified or estimated.

algorithm, digraph, rk-ccd, (12 more...)

arXiv.org Machine Learning

1912.11926

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(10 more...)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback