Image Clustering Conditioned on Text Criteria

Bibliographic Details
Title: Image Clustering Conditioned on Text Criteria
Authors: Kwon, Sehyun; Park, Jaeseung; Kim, Minkyu; Cho, Jaewoong; Ryu, Ernest K.; Lee, Kangwook
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Description: Classical clustering methods do not provide users with direct control of the clustering results, and the clustering results may not be consistent with the relevant criterion that a user has in mind. In this work, we present a new methodology for performing image clustering based on user-specified text criteria by leveraging modern vision-language models and large language models. We call our method Image Clustering Conditioned on Text Criteria (IC|TC), and it represents a different paradigm of image clustering. IC|TC requires a minimal and practical degree of human intervention and grants the user significant control over the clustering results in return. Our experiments show that IC|TC can effectively cluster images with various criteria, such as human action, physical location, or the person's mood, while significantly outperforming baselines.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2310.18297
Accession Number: edsarx.2310.18297
Database: arXiv