Exploring Long- and Short-Range Temporal Information for Learned Video Compression

Bibliographic Details
Title: Exploring Long- and Short-Range Temporal Information for Learned Video Compression
Authors: Wang, Huairui, Chen, Zhenzhong
Publication Year: 2022
Collection: Computer Science
Subject Terms: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Description: Learned video compression methods have attracted considerable interest in the video coding community since they have matched or even exceeded the rate-distortion (RD) performance of traditional video codecs. However, many current learning-based methods focus on short-range temporal information, which limits their performance. In this paper, we focus on exploiting the unique characteristics of video content and further exploring temporal information to enhance compression performance. Specifically, to exploit long-range temporal information, we propose a temporal prior that is updated continuously within the group of pictures (GOP) during inference. The temporal prior therefore contains valuable temporal information from all decoded images within the current GOP. To exploit short-range temporal information, we propose a progressive guided motion compensation to achieve robust and effective compensation. In detail, we design a hierarchical structure to achieve multi-scale compensation. More importantly, we use optical flow guidance to generate pixel offsets between feature maps at each scale, and the compensation result at each scale is used to guide the compensation at the following scale. Extensive experimental results demonstrate that our method can obtain better RD performance than state-of-the-art video compression approaches. The code is publicly available at: https://github.com/Huairui/LSTVC.
Comment: arXiv admin note: text overlap with arXiv:2207.04589
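Illustrative note: the description says the temporal prior is updated continuously within a GOP and so carries information from every decoded image in that GOP. The sketch below shows only that control flow, under stated assumptions. The names `GOP_SIZE`, `update_prior`, and `run_sequence` are hypothetical, frames are plain lists of floats, and the moving-average update is a stand-in for the learned update in the paper, not the authors' method. Resetting the prior at each GOP boundary is inferred from the phrase "within the current GOP".

```python
from typing import List, Optional

GOP_SIZE = 4  # hypothetical GOP length, chosen only for illustration


def update_prior(prior: Optional[List[float]],
                 decoded: List[float],
                 alpha: float = 0.5) -> List[float]:
    """Blend the running prior with the newest decoded frame.

    Assumption: a moving average stands in for the learned update.
    """
    if prior is None:
        # First frame of a GOP: the prior starts from that frame alone.
        return list(decoded)
    return [alpha * p + (1.0 - alpha) * d for p, d in zip(prior, decoded)]


def run_sequence(frames: List[List[float]],
                 gop_size: int = GOP_SIZE) -> List[Optional[List[float]]]:
    """Return the prior that was available when each frame was coded."""
    priors_seen: List[Optional[List[float]]] = []
    prior: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        if index % gop_size == 0:
            # New GOP: drop everything accumulated in the previous GOP.
            prior = None
        priors_seen.append(prior)
        decoded = frame  # stand-in for the real encode/decode round trip
        prior = update_prior(prior, decoded)
    return priors_seen
```

With five one-sample frames `[[0.0], [2.0], [4.0], [8.0], [10.0]]` and a GOP size of 4, the first and fifth frames see no prior, while the second to fourth frames see a prior accumulated from all earlier frames of the same GOP.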
Document Type: Working Paper
DOI: 10.1109/TIP.2024.3349859
Access URL: http://arxiv.org/abs/2208.03754
Accession Number: edsarx.2208.03754
Database: arXiv