K-tree: Large Scale Document Clustering
De Vries, Christopher M., Geva, Shlomo
arXiv.org Artificial Intelligence
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm and, unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare its performance and clustering quality to CLUTO using document collections. The K-tree has a low time complexity, which makes it suitable for large document collections. Its tree structure allows for efficient disk-based implementations where space requirements exceed main memory.
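To make the idea of a hierarchical k-means approximation concrete, here is a minimal sketch in Python. It is not the authors' implementation, and the abstract does not spell out the mechanics. The sketch assumes a height-balanced tree with at most `order` entries per node, insertion by descending to the nearest centroid, and splitting of overfull nodes with k-means (k=2). The names `KTree`, `Node`, `two_means` and `nearest_leaf`, the seeding rule in `two_means`, and the toy data are all hypothetical choices made for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors, weights=None):
    """Weighted mean of a non-empty list of vectors (unit weights by default)."""
    weights = weights or [1] * len(vectors)
    total = sum(weights)
    return [sum(w * v[d] for v, w in zip(vectors, weights)) / total
            for d in range(len(vectors[0]))]


def two_means(points, iters=10):
    """Label points 0/1 with plain k-means (k=2); both groups end up non-empty.
    Seeding (first point plus the point farthest from it) is an assumption."""
    centers = [points[0], max(points, key=lambda p: dist2(p, points[0]))]
    labels = []
    for _ in range(iters):
        labels = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
                  for p in points]
        if len(set(labels)) < 2:  # degenerate case, e.g. all points identical
            half = len(points) // 2
            return [0] * half + [1] * (len(points) - half)
        centers = [mean([p for p, l in zip(points, labels) if l == g])
                   for g in (0, 1)]
    return labels


class Node:
    """Leaf nodes hold vectors; internal nodes hold child nodes."""

    def __init__(self, leaf, items):
        self.leaf, self.items = leaf, items
        self.refresh()

    def refresh(self):
        """Recompute the number of vectors below this node and their mean."""
        if self.leaf:
            self.count = len(self.items)
            self.centroid = mean(self.items)
        else:
            self.count = sum(c.count for c in self.items)
            self.centroid = mean([c.centroid for c in self.items],
                                 [c.count for c in self.items])


def _insert(node, vec, order):
    """Insert vec below node; return the one or two nodes that replace it."""
    if node.leaf:
        node.items.append(vec)
    else:
        i = min(range(len(node.items)),
                key=lambda j: dist2(node.items[j].centroid, vec))
        node.items[i:i + 1] = _insert(node.items[i], vec, order)
    if len(node.items) <= order:
        node.refresh()
        return [node]
    # Overfull node: split its entries into two nodes with 2-means.
    keys = node.items if node.leaf else [c.centroid for c in node.items]
    labels = two_means(keys)
    return [Node(node.leaf, [it for it, l in zip(node.items, labels) if l == g])
            for g in (0, 1)]


class KTree:
    def __init__(self, order):
        self.order, self.root = order, None  # order should be at least 2

    def insert(self, vec):
        if self.root is None:
            self.root = Node(True, [list(vec)])
            return
        parts = _insert(self.root, list(vec), self.order)
        # A split root gets a new parent, so every leaf stays at the same depth.
        self.root = parts[0] if len(parts) == 1 else Node(False, parts)

    def nearest_leaf(self, vec):
        """Greedy descent by nearest centroid; returns the vectors in that leaf."""
        node = self.root
        while not node.leaf:
            node = min(node.items, key=lambda c: dist2(c.centroid, vec))
        return node.items


# Toy data: two small, well-separated groups of 2-D points (hypothetical).
blob_a = [[0.1 * i, 0.1 * j] for i in range(3) for j in range(3)]
blob_b = [[10 + 0.1 * i, 10 + 0.1 * j] for i in range(3) for j in range(3)]
tree = KTree(order=4)
for point in blob_a + blob_b:
    tree.insert(point)
print(tree.root.count, tree.nearest_leaf([0.0, 0.0]))
```

In this sketch each insertion compares the new vector against at most `order` centroids per level, which is one way to read the abstract's claim of low time complexity. Because nodes are small and self-contained, they could in principle be stored as disk pages, though the sketch keeps everything in memory.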
Jan-6-2010
- Country:
- North America > United States
- Massachusetts (0.15)
- Oceania > Australia
- Queensland (0.16)