Clustering idea for very large datasets