Image Clustering Conditioned on Text Criteria
Sehyun Kwon, Jaeseung Park, Minkyu Kim, Jaewoong Cho, Ernest K. Ryu, Kangwook Lee
arXiv.org Artificial Intelligence
Classical clustering methods do not provide users with direct control of the clustering results, and the clustering results may not be consistent with the relevant criterion that a user has in mind. In this work, we present a new methodology for performing image clustering based on user-specified text criteria by leveraging modern vision-language models and large language models. We call our method Image Clustering Conditioned on Text Criteria (IC|TC), and it represents a different paradigm of image clustering. IC|TC requires a minimal and practical degree of human intervention and grants the user significant control over the clustering results in return. Our experiments show that IC|TC can effectively cluster images with various criteria, such as human action, physical location, or the person's mood, while significantly outperforming baselines.
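As a rough illustration of what clustering conditioned on a text criterion can look like, here is a minimal sketch. It is not the authors' implementation, and the abstract does not specify the pipeline's steps. The function `cluster_by_text_criterion` and its parameters `describe` (standing in for a vision-language model call) and `assign` (standing in for a large language model call) are hypothetical names introduced only for this example, and the toy stand-ins below use hard-coded captions instead of real models.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List


def cluster_by_text_criterion(
    images: Iterable[Hashable],
    criterion: str,
    describe: Callable[[Hashable, str], str],
    assign: Callable[[str, str], str],
) -> Dict[str, List[Hashable]]:
    """Group images by a label derived from a user-specified text criterion.

    describe: hypothetical stand-in for a vision-language model call.
    assign:   hypothetical stand-in for a large language model call.
    """
    clusters: Dict[str, List[Hashable]] = defaultdict(list)
    for image in images:
        # Turn the image into text, focusing on the criterion.
        description = describe(image, criterion)
        # Map the text description to a cluster label for that criterion.
        label = assign(description, criterion)
        clusters[label].append(image)
    return dict(clusters)


# Toy stand-ins (assumptions) so the sketch runs without any model.
_CAPTIONS = {
    "a.jpg": "a man playing guitar in a park",
    "b.jpg": "a woman riding a bicycle on a road",
    "c.jpg": "a child playing guitar on a stage",
}


def toy_describe(image: str, criterion: str) -> str:
    # A real system would query a vision-language model here.
    return _CAPTIONS[image]


def toy_assign(description: str, criterion: str) -> str:
    # A real system would query a large language model here.
    if criterion == "action":
        return "playing guitar" if "guitar" in description else "riding a bicycle"
    outdoors = "park" in description or "road" in description
    return "outdoors" if outdoors else "indoors"


by_action = cluster_by_text_criterion(_CAPTIONS, "action", toy_describe, toy_assign)
by_location = cluster_by_text_criterion(_CAPTIONS, "location", toy_describe, toy_assign)
```

In this toy setup the same three images are grouped differently depending on whether the criterion is "action" or "location", which is the kind of user control the abstract describes.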
Nov-29-2023
- Genre:
  - Research Report > New Finding (0.67)
- Industry:
  - Leisure & Entertainment > Sports (0.93)
  - Media > Music (1.00)
  - Transportation > Ground
    - Road (0.46)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Statistical Learning > Clustering (1.00)
    - Natural Language
      - Chatbot (1.00)
      - Large Language Model (1.00)
    - Vision (1.00)