K-tree: Large Scale Document Clustering

De Vries, Christopher M., Geva, Shlomo

arXiv.org Artificial Intelligence 

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk based implementations where space requirements exceed that of main memory.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found