AITopics | steinley

Collaborating Authors

steinley


Minimum adjusted Rand index for two clusterings of a given size

Chacón, José E., Rastrojo, Ana I.

arXiv.org Machine Learning, Feb-10-2020

The adjusted Rand index is one of the most commonly used similarity measures to compare two clusterings of a given set of objects. Indeed, it is the recommended criterion for external clustering evaluation in the seminal study of Milligan and Cooper (1986). Nevertheless, many other measures for external clustering evaluation were recently surveyed in Meilă (2016). Initially, Rand (1971) considered a similarity index between two clusterings (the Rand index) defined as the proportion of object pairs that are either assigned to the same cluster in both clusterings or to different clusters in both clusterings. However, Morey and Agresti (1984) noted that such an index does not take into account the possible agreement by chance, and Hubert and Arabie (1985) introduced a corrected-for-chance version of the Rand index, which is usually known as the adjusted Rand index (ARI).
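The two indices described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `rand_and_adjusted_rand` is hypothetical, the Rand index follows the pair-counting definition quoted above, and the correction for chance uses the standard Hubert and Arabie (1985) formula based on binomial pair counts. Returning 1.0 when both clusterings are trivial is a convention assumed here.

```python
from collections import Counter
from math import comb


def rand_and_adjusted_rand(labels_a, labels_b):
    """Return (Rand index, adjusted Rand index) for two label sequences.

    Hypothetical helper for illustration; labels may be any hashable values.
    """
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two label sequences of equal length >= 2")

    total_pairs = comb(n, 2)
    # Pairs placed in the same cluster by both clusterings.
    sum_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed in the same cluster by each clustering separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Agreements = pairs together in both + pairs apart in both.
    rand = (total_pairs + 2 * sum_joint - sum_a - sum_b) / total_pairs

    # Hubert-Arabie correction: subtract the expected value of sum_joint
    # under random labelling with fixed cluster sizes, then rescale.
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Both clusterings are all-singletons or a single cluster (assumed convention).
        return rand, 1.0
    return rand, (sum_joint - expected) / (max_index - expected)


if __name__ == "__main__":
    print(rand_and_adjusted_rand([0, 0, 1, 1], [0, 0, 1, 2]))
    print(rand_and_adjusted_rand([0, 0, 1, 1], [0, 1, 0, 1]))
```

In the second call the two clusterings cross each other completely: the Rand index is still positive (1/3), while the ARI is negative (-0.5), which illustrates why a chance-corrected index can fall below zero.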

agreement, rand index, steinley, (16 more...)

arXiv.org Machine Learning

2002.03677

Country:

Europe > Spain > Extremadura (0.05)
North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.91)


Explicit agreement extremes for a $2\times2$ table with given marginals

Chacón, José E.

arXiv.org Machine Learning, Jan-21-2020

Given two different clusterings of a data set, many measures have been proposed to quantify their degree of concordance. A recent review of a representative number of them can be found in Meilă (2016). These measures are usually categorized into three classes: those based on inspecting the assignments of data pairs in both clusterings, those involving some cluster matching between the two clusterings, and those relying on information theoretic criteria. This paper concerns the first one of these classes. In fact, some of the most popular and widely used similarity measures, such as the Rand index, the Jaccard index, or the Fowlkes-Mallows index, belong to this class of pair-based similarities, but it should be noted that there is a plethora of them, as explored in Albatineh, Niewiadomska-Bugaj and Mihalko (2006), Warrens (2008) or Warrens and van der Hoef (2019).
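As a rough illustration of the pair-based class, the sketch below classifies every pair of objects into a 2×2 table and derives the three indices named above from it. The names `pair_table` and `pair_based_indices` are hypothetical, and the formulas are the commonly used textbook definitions rather than anything quoted from the paper.

```python
from itertools import combinations
from math import sqrt


def pair_table(labels_a, labels_b):
    """Count object pairs by whether each clustering puts them together.

    Returns (a, b, c, d):
      a = together in both clusterings
      b = together in the first only
      c = together in the second only
      d = apart in both clusterings
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            a += 1
        elif same_a:
            b += 1
        elif same_b:
            c += 1
        else:
            d += 1
    return a, b, c, d


def pair_based_indices(a, b, c, d):
    """Rand, Jaccard and Fowlkes-Mallows indices from the 2x2 pair table.

    Assumes at least one pair is together in each clustering, so that
    no denominator is zero.
    """
    return {
        "rand": (a + d) / (a + b + c + d),
        "jaccard": a / (a + b + c),
        "fowlkes_mallows": a / sqrt((a + b) * (a + c)),
    }


if __name__ == "__main__":
    table = pair_table([0, 0, 1, 1], [0, 0, 1, 2])
    print(table)
    print(pair_based_indices(*table))
```

The row and column totals of this table (`a + b` and `a + c`) depend only on the cluster sizes of each clustering, which is the sense in which the table has given marginals. Enumerating pairs is quadratic in the number of objects, so this form is meant for clarity rather than for large data sets.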

agreement, configuration, steinley, (14 more...)

arXiv.org Machine Learning

2001.07415

Country:

North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > Spain > Extremadura (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.92)


A close-up comparison of the misclassification error distance and the adjusted Rand index for external clustering evaluation

Chacón, José E.

arXiv.org Machine Learning, Jul-26-2019

The adjusted Rand index (ARI) was the recommended choice in the seminal paper of Milligan and Cooper (1986), where five criteria were examined regarding the task of comparison of hierarchical clustering algorithms across different hierarchy levels. Their recommendation is based on the fact that, for the null case data (i.e., for a synthetic sample with randomly assigned class labels, showing no significant cluster structure), the ARI was the only index that produced a flat response curve across hierarchy levels, with mean values close to zero, hence indicating that the agreement between the randomly assigned labels and the algorithm solution was due to chance. Another popular measure for clustering validation, not included in Milligan and Cooper's study, is the misclassification error distance (MED). Its first appearance in the literature dates back at least to Régnier (1965), where it was introduced as a distance between partitions of a finite set, and it was called transfer distance. It is also referred to as partition distance (Gusfield, 2002) or maximum matching distance (Rossi, 2015).
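The excerpt does not state a formula for the MED, so the following is only a sketch under an assumed definition: the smallest proportion of objects that disagree once the cluster labels of one clustering are matched one-to-one with those of the other. The function name `misclassification_error_distance` is hypothetical, and the matching is found by brute force over permutations, which is only practical for a handful of clusters.

```python
from collections import Counter
from itertools import permutations


def misclassification_error_distance(labels_a, labels_b):
    """Assumed MED: 1 - (largest number of objects matched by a
    one-to-one pairing of clusters) / (number of objects)."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("need two non-empty label sequences of equal length")

    joint = Counter(zip(labels_a, labels_b))  # contingency counts
    clusters_a = list(dict.fromkeys(labels_a))
    clusters_b = list(dict.fromkeys(labels_b))

    # Pair each cluster of the clustering with fewer clusters with a
    # distinct cluster of the other; leftover clusters stay unmatched.
    if len(clusters_a) <= len(clusters_b):
        small, large = clusters_a, clusters_b

        def overlap(s, l):
            return joint[(s, l)]
    else:
        small, large = clusters_b, clusters_a

        def overlap(s, l):
            return joint[(l, s)]

    best_matched = max(
        sum(overlap(s, l) for s, l in zip(small, assignment))
        for assignment in permutations(large, len(small))
    )
    return 1 - best_matched / n


if __name__ == "__main__":
    print(misclassification_error_distance([0, 0, 1, 1], [1, 1, 0, 0]))
    print(misclassification_error_distance([0, 0, 1, 1], [0, 0, 1, 2]))
```

The first call gives 0.0 because the two clusterings differ only by a relabelling; the second gives 0.25 because one object out of four has to be moved. For more than a few clusters, an assignment solver such as the Hungarian algorithm would replace the permutation search.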

confusion matrix, matrix, partition, (16 more...)

arXiv.org Machine Learning

1907.11505

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
Europe > Spain > Extremadura (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Strength High (0.54)
Research Report > Experimental Study (0.54)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
