Academic Journal

GDUI: Guided Diffusion Model for Unlabeled Images

Bibliographic Details
Title: GDUI: Guided Diffusion Model for Unlabeled Images
Authors: Xuanyuan Xie, Jieyu Zhao
Source: Algorithms, Vol 17, Iss 3, p 125 (2024)
Publisher Information: MDPI AG, 2024.
Publication Year: 2024
Collection: LCC:Industrial engineering. Management engineering
LCC:Electronic computers. Computer science
Subject Terms: image synthesis, guided diffusion, semantic aware, pseudo-label matching, Industrial engineering. Management engineering, T55.4-60.8, Electronic computers. Computer science, QA75.5-76.95
Description: Diffusion models have made progress in image synthesis, especially in conditional image synthesis. However, this improvement is highly dependent on large annotated datasets. To tackle this challenge, we present the Guided Diffusion model for Unlabeled Images (GDUI) framework in this article. It utilizes the inherent feature similarity and semantic differences in the data, as well as the downstream transferability of Contrastive Language-Image Pretraining (CLIP), to guide the diffusion model in generating high-quality images. We design two semantic-aware algorithms, namely the pseudo-label-matching algorithm and the label-matching refinement algorithm, to match the clustering results with the true semantic information and provide more accurate guidance for the diffusion model. First, GDUI encodes the image into a semantically meaningful latent vector through clustering. Then, pseudo-label matching is used to match each image to its true semantic information. Finally, the label-matching refinement algorithm is used to adjust the irrelevant semantic information in the data, thereby improving the quality of images generated by the guided diffusion model. Our experiments on labeled datasets show that GDUI outperforms diffusion models without any guidance and significantly reduces the gap between itself and models guided by ground-truth labels.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1999-4893
Relation: https://www.mdpi.com/1999-4893/17/3/125; https://doaj.org/toc/1999-4893
DOI: 10.3390/a17030125
Access URL: https://doaj.org/article/cb9fb7f4bad542e6b45906345061f758
Accession Number: edsdoj.b9fb7f4bad542e6b45906345061f758
Database: Directory of Open Access Journals
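
Illustrative note: the abstract describes matching clustering results to semantic labels. The sketch below shows one simple way such a matching step could look; it is not the authors' algorithm. It assumes each image already has a cluster id and a row of image-to-label similarity scores (for example, from a CLIP-style model), and it assigns each cluster the label with the highest mean similarity over its members. The function name `match_pseudo_labels` and the toy data are hypothetical.

```python
from collections import defaultdict


def match_pseudo_labels(cluster_ids, similarity, label_names):
    """Map each cluster id to the label with the highest mean similarity.

    cluster_ids: one cluster id per image (assumed given by a prior clustering step).
    similarity:  one row per image, one score per candidate label
                 (assumed to come from a CLIP-style image-text model).
    label_names: candidate label strings, aligned with the score columns.
    """
    sums = defaultdict(lambda: [0.0] * len(label_names))
    counts = defaultdict(int)
    for cid, row in zip(cluster_ids, similarity):
        counts[cid] += 1
        for j, score in enumerate(row):
            sums[cid][j] += score

    mapping = {}
    for cid, totals in sums.items():
        means = [t / counts[cid] for t in totals]
        best = max(range(len(means)), key=means.__getitem__)
        mapping[cid] = label_names[best]
    return mapping


# Toy data (hypothetical): four images, two clusters, two candidate labels.
cluster_ids = [0, 0, 1, 1]
similarity = [
    [0.9, 0.1],
    [0.7, 0.3],
    [0.2, 0.8],
    [0.4, 0.6],
]
mapping = match_pseudo_labels(cluster_ids, similarity, ["cat", "dog"])
print(mapping)
```

Averaging scores per cluster is only one possible design choice; the refinement step mentioned in the abstract is not covered by this sketch.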