Priorformer: A UGC-VQA Method with content and distortion priors

Bibliographic Details
Title: Priorformer: A UGC-VQA Method with content and distortion priors
Authors: Pei, Yajing, Huang, Shiyu, Lu, Yiting, Li, Xin, Chen, Zhibo
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Description: User Generated Content (UGC) videos are susceptible to complicated and varied degradations and contents, which prevents existing blind video quality assessment (BVQA) models from performing well, since they lack adaptability to diverse distortions and contents. To mitigate this, we propose a novel prior-augmented perceptual vision transformer (PriorFormer) for the BVQA of UGC, which boosts its adaptability and representation capability for divergent contents and distortions. Concretely, we introduce two powerful priors, i.e., the content and distortion priors, by extracting the content and distortion embeddings from two pre-trained feature extractors. Then we adopt these two powerful embeddings as the adaptive prior tokens, which are transferred to the vision transformer backbone jointly with implicit quality features. Based on the above strategy, the proposed PriorFormer achieves state-of-the-art performance on three public UGC VQA datasets including KoNViD-1K, LIVE-VQC and YouTube-UGC.
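The description above sketches an architecture in which content and distortion embeddings from two pre-trained extractors are fed to a vision transformer as extra prior tokens alongside the implicit quality features. A minimal illustrative sketch of that token-assembly step is below; the function name, dimensions, and use of NumPy are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def build_token_sequence(patch_tokens, content_embed, distortion_embed):
    """Prepend content/distortion prior tokens to the patch-token sequence.

    Hypothetical sketch of the idea in the abstract (not the paper's code):
      patch_tokens:     (num_patches, d) implicit quality features
      content_embed:    (d,) embedding from a pre-trained content extractor
      distortion_embed: (d,) embedding from a pre-trained distortion extractor
    Returns a (num_patches + 2, d) joint sequence for the transformer backbone.
    """
    priors = np.stack([content_embed, distortion_embed])  # (2, d)
    return np.concatenate([priors, patch_tokens], axis=0)

# Toy usage with illustrative dimensions
d = 8
patches = np.random.randn(16, d)
seq = build_token_sequence(patches, np.random.randn(d), np.random.randn(d))
print(seq.shape)  # (18, 8)
```

The backbone would then attend over the prior tokens and patch tokens jointly, which is how the adaptive priors can condition the quality prediction.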
Comment: 7 pages
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2406.16297
Accession Number: edsarx.2406.16297
Database: arXiv