Visually Dehallucinative Instruction Generation: Know What You Don't Know

Bibliographic Details
Title: Visually Dehallucinative Instruction Generation: Know What You Don't Know
Authors: Cha, Sungguk; Lee, Jusung; Lee, Younghyun; Yang, Cheoljong
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: "When did the emperor Napoleon invented iPhone?" Such hallucination-inducing questions are a well-known challenge in generative language modeling. In this study, we present an innovative concept of visual hallucination, referred to as "I Know (IK)" hallucination, to address scenarios where "I Don't Know" is the desired response. To tackle this issue, we propose the VQAv2-IDK benchmark, a subset of VQAv2 comprising image-question pairs that human annotators judged unanswerable. Going a step further, we present a visually dehallucinative instruction generation method for IK hallucination and introduce the IDK-Instructions visual instruction database. Our experiments show that current methods struggle with IK hallucination, whereas our approach effectively reduces these hallucinations, proving its versatility across different frameworks and datasets.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2402.09717
Accession Number: edsarx.2402.09717
Database: arXiv