D$^4$-VTON: Dynamic Semantics Disentangling for Differential Diffusion based Virtual Try-On

Bibliographic Details
Title: D$^4$-VTON: Dynamic Semantics Disentangling for Differential Diffusion based Virtual Try-On
Authors: Yang, Zhaotong; Jiang, Zicheng; Li, Xinzhe; Zhou, Huiyu; Dong, Junyu; Zhang, Huaidong; Du, Yong
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: In this paper, we introduce D$^4$-VTON, an innovative solution for image-based virtual try-on. We address challenges from previous studies, such as semantic inconsistencies before and after garment warping, and reliance on static, annotation-driven clothing parsers. Additionally, we tackle the complexities in diffusion-based VTON models when handling simultaneous tasks like inpainting and denoising. Our approach utilizes two key technologies: Firstly, Dynamic Semantics Disentangling Modules (DSDMs) extract abstract semantic information from garments to create distinct local flows, improving precise garment warping in a self-discovered manner. Secondly, by integrating a Differential Information Tracking Path (DITP), we establish a novel diffusion-based VTON paradigm. This path captures differential information between incomplete try-on inputs and their complete versions, enabling the network to handle multiple degradations independently, thereby minimizing learning ambiguities and achieving realistic results with minimal overhead. Extensive experiments demonstrate that D$^4$-VTON significantly outperforms existing methods in both quantitative metrics and qualitative evaluations, demonstrating its capability in generating realistic images and ensuring semantic consistency.
Comment: ECCV 2024
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.15111
Accession Number: edsarx.2407.15111
Database: arXiv