SVGDreamer: Text Guided SVG Generation with Diffusion Model

Bibliographic Details
Title: SVGDreamer: Text Guided SVG Generation with Diffusion Model
Authors: Xing, Ximing; Zhou, Haitao; Wang, Chuang; Zhang, Jing; Xu, Dong; Yu, Qian
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Description: Recently, text-guided scalable vector graphics (SVGs) synthesis has shown promise in domains such as iconography and sketch. However, existing text-to-SVG generation methods lack editability and struggle with visual quality and result diversity. To address these limitations, we propose a novel text-guided vector graphics synthesis method called SVGDreamer. SVGDreamer incorporates a semantic-driven image vectorization (SIVE) process that enables the decomposition of synthesis into foreground objects and background, thereby enhancing editability. Specifically, the SIVE process introduces attention-based primitive control and an attention-mask loss function for effective control and manipulation of individual elements. Additionally, we propose a Vectorized Particle-based Score Distillation (VPSD) approach to address issues of shape over-smoothing, color over-saturation, limited diversity, and slow convergence of the existing text-to-SVG generation methods by modeling SVGs as distributions of control points and colors. Furthermore, VPSD leverages a reward model to re-weight vector particles, which improves aesthetic appeal and accelerates convergence. Extensive experiments are conducted to validate the effectiveness of SVGDreamer, demonstrating its superiority over baseline methods in terms of editability, visual quality, and diversity. Project page: https://ximinng.github.io/SVGDreamer-project/
Comment: Accepted by CVPR 2024. Project link: https://ximinng.github.io/SVGDreamer-project/
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2312.16476
Accession Number: edsarx.2312.16476
Database: arXiv