Mixing Natural and Synthetic Images for Robust Self-Supervised Representations

Bibliographic Details
Title: Mixing Natural and Synthetic Images for Robust Self-Supervised Representations
Authors: Bafghi, Reza Akbarian; Harilal, Nidhin; Monteleoni, Claire; Raissi, Maziar
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: This paper introduces DiffMix, a new self-supervised learning (SSL) pre-training framework that combines real and synthetic images. Unlike traditional SSL methods that predominantly use real images, DiffMix uses a variant of Stable Diffusion to replace an augmented instance of a real image, facilitating the learning of cross real-synthetic image representations. The key insight is that while SSL methods trained solely on synthetic images underperform those trained on real images, a blended training approach using both real and synthetic images leads to more robust and adaptable representations. Experiments demonstrate that DiffMix enhances the SSL methods SimCLR, BarlowTwins, and DINO across various robustness datasets and domain transfer tasks. DiffMix boosts SimCLR's accuracy on ImageNet-1K by 4.56%. These results challenge the notion that high-quality real images are crucial for SSL pre-training by showing that lower-quality synthetic images can also produce strong representations. DiffMix also reduces the need for image augmentations in SSL, offering new optimization strategies.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2406.12368
Accession Number: edsarx.2406.12368
Database: arXiv
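
The description above says that DiffMix replaces an augmented instance of a real image with a synthetic one. The following is a minimal sketch of that idea only, not the authors' implementation: the function name make_view_pair, the parameter p_synthetic, and the per-sample coin flip are assumptions made for illustration, since this record does not state how or how often the replacement is applied. The synthetic image is assumed to have been generated in advance from the real one.

```python
import random
from typing import Callable, Tuple, TypeVar

Image = TypeVar("Image")


def make_view_pair(
    real_image: Image,
    synthetic_image: Image,
    augment: Callable[[Image], Image],
    rng: random.Random,
    p_synthetic: float = 0.5,  # hypothetical mixing rate, not taken from the paper
) -> Tuple[Image, Image]:
    """Build the two views that an SSL method such as SimCLR would compare.

    The first view is always an augmentation of the real image. The second
    view is the pre-generated synthetic image with probability p_synthetic,
    and otherwise a second augmentation of the real image.
    """
    if not 0.0 <= p_synthetic <= 1.0:
        raise ValueError("p_synthetic must lie in [0, 1]")

    view_a = augment(real_image)
    if rng.random() < p_synthetic:
        # Replace the second augmented instance with the synthetic image.
        view_b = synthetic_image
    else:
        # Fall back to the usual two-augmentation pairing.
        view_b = augment(real_image)
    return view_a, view_b
```

Under these assumptions, setting p_synthetic to 0.0 recovers the usual pairing of two real augmentations, while 1.0 always pairs a real view with a synthetic one. The resulting pair would then be passed to the chosen SSL objective unchanged.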