Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature

Bibliographic Details
Title: Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature
Authors: de Faria, Ana Cláudia Akemi Matsuki; Bastos, Felype de Castro; da Silva, José Victor Nogueira Alves; Fabris, Vitor Lopes; Uchoa, Valeska de Sousa; Neto, Décio Gonçalves de Aguiar; Santos, Claudio Filipi Goncalves dos
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Description: Visual Question Answering (VQA) is an emerging area of interest for researchers, being a recent problem in natural language processing and image prediction. In this area, an algorithm needs to answer questions about certain images. As of the writing of this survey, 25 recent studies were analyzed. In addition, 6 datasets were analyzed, and links to download them are provided. In this work, several recent pieces of research in this area were investigated, and a deeper analysis and comparison among them is provided, including results, the state of the art, common errors, and possible points of improvement for future researchers.
Comment: 30 pages. arXiv admin note: text overlap with arXiv:2104.00926, arXiv:2110.02526, arXiv:2108.02059, arXiv:1908.01801 by other authors
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2305.11033
Accession Number: edsarx.2305.11033
Database: arXiv