LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment

Bibliographic Details
Title:	LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment
Authors:	Helff, Lukas; Friedrich, Felix; Brack, Manuel; Kersting, Kristian; Schramowski, Patrick
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Description:	We introduce LlavaGuard, a family of VLM-based safeguard models, offering a versatile framework for evaluating the safety compliance of visual content. Specifically, we designed LlavaGuard for dataset annotation and generative model safeguarding. To this end, we collected and annotated a high-quality visual dataset incorporating a broad safety taxonomy, which we use to tune VLMs on context-aware safety risks. As a key innovation, LlavaGuard's new responses contain comprehensive information, including a safety rating, the violated safety categories, and an in-depth rationale. Further, our introduced customizable taxonomy categories enable the context-specific alignment of LlavaGuard to various scenarios. Our experiments highlight the capabilities of LlavaGuard in complex and real-world applications. We provide checkpoints ranging from 7B to 34B parameters demonstrating state-of-the-art performance, with even the smallest models outperforming baselines like GPT-4. We make our dataset and model weights publicly available and invite further research to address the diverse needs of communities and contexts. Comment: Project page at https://ml-research.github.io/human-centered-genai/projects/llavaguard/index.html
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2406.05113
Accession Number:	edsarx.2406.05113
Database:	arXiv