Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge

Bibliographic Details
Title: Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge
Authors: Shen, Kang; Liu, Xuxiong; Wang, Boyan; Yao, Jun; Liu, Xin; Guan, Yujie; Wang, Yu; Li, Gengchen; Sun, Xiao
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: In this paper, we present our approach to the 7th ABAW competition. The competition comprises three sub-challenges: Valence-Arousal (VA) estimation, Expression (Expr) classification, and Action Unit (AU) detection. To tackle these sub-challenges, we employ state-of-the-art models to extract powerful visual features. A Transformer Encoder then integrates these features for the VA, Expr, and AU sub-challenges. Because the extracted features differ in dimension, we introduce an affine module that aligns them to a common dimension. Overall, our results significantly outperform the baselines.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.12258
Accession Number: edsarx.2407.12258
Database: arXiv
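
Illustrative note: the description mentions an affine module that aligns features of different dimensions to a common dimension before fusion. The sketch below shows one way such an alignment could look, using only the Python standard library. It is not the authors' implementation: the encoder names, feature sizes, common dimension, and random initialisation are all assumptions made for illustration, and the Transformer Encoder fusion step itself is not implemented here.

```python
import random


def make_affine(in_dim, out_dim, rng):
    """Return (W, b) for an affine map from in_dim to out_dim (random init)."""
    scale = 1.0 / in_dim ** 0.5
    W = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
         for _ in range(out_dim)]
    b = [0.0] * out_dim
    return W, b


def apply_affine(params, x):
    """Compute y = W x + b for a single feature vector x."""
    W, b = params
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]


def align_features(features, affines):
    """Project each encoder's feature vector to the common dimension.

    Returns one equal-length token per encoder, the kind of input a
    fusion model such as a Transformer encoder could consume.
    """
    return [apply_affine(affines[name], vec) for name, vec in features.items()]


rng = random.Random(0)
COMMON_DIM = 8  # hypothetical value; the description does not state it

# Hypothetical encoder names and feature sizes, for illustration only.
encoder_dims = {"encoder_a": 16, "encoder_b": 5, "encoder_c": 12}

affines = {name: make_affine(d, COMMON_DIM, rng)
           for name, d in encoder_dims.items()}
features = {name: [rng.gauss(0.0, 1.0) for _ in range(d)]
            for name, d in encoder_dims.items()}

tokens = align_features(features, affines)
```

After alignment, every entry of `tokens` has length `COMMON_DIM`, so the per-encoder features can be stacked into a single sequence regardless of their original sizes. In a trained model the weights would be learned rather than random.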