Report
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Title: | The Chosen One: Consistent Characters in Text-to-Image Diffusion Models |
---|---|
Authors: | Avrahami, Omri; Hertz, Amir; Vinker, Yael; Arar, Moab; Fruchter, Shlomi; Fried, Ohad; Cohen-Or, Daniel; Lischinski, Dani |
Publication year: | 2023 |
Collection: | Computer Science |
Subject terms: | Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Machine Learning |
Description: | Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, the users that use these models struggle with the generation of consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development, asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach. Comment: Accepted to SIGGRAPH 2024. Project page is available at https://omriavrahami.com/the-chosen-one/ |
Document type: | Working Paper |
DOI: | 10.1145/3641519.3657430 |
Access URL: | http://arxiv.org/abs/2311.10093 |
Accession number: | edsarx.2311.10093 |
Database: | arXiv |