A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation

Bibliographic Details
Title: A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation
Authors: Fernandes, Jose Geraldo; Nascimento, Sinval; Dominguete, Daniel; Oliveira, André; Rotsen, Lucas; Souza, Gabriel; Brochero, David; Facury, Luiz; Vilela, Mateus; Costa, Hebert; Coelho, Frederico; Braga, Antônio P.
Publication Year: 2024
Subject Terms: Electrical Engineering and Systems Science - Audio and Speech Processing
Description: Synchronizing audio with visuals is crucial in many applications, such as creating graphic animations for films or games, translating movie audio into different languages, and developing metaverse applications. This review explores various methodologies for achieving realistic facial animations from audio inputs, highlighting generative and adaptive models. Addressing challenges like model training costs, dataset availability, and silent moment distributions in audio data, it presents innovative solutions to enhance performance and realism. The research also introduces a new taxonomy to categorize audio-visual synchronization methods based on logistical aspects, advancing the capabilities of virtual assistants, gaming, and interactive digital media.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.17430
Accession Number: edsarx.2407.17430
Database: arXiv