Clustering by Attention: Leveraging Prior Fitted Transformers for Data Partitioning