Towards Online Real-Time Memory-based Video Inpainting Transformers

Bibliographic Details
Title: Towards Online Real-Time Memory-based Video Inpainting Transformers
Authors: Thiry, Guillaume; Tang, Hao; Timofte, Radu; Van Gool, Luc
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: Video inpainting tasks have seen significant improvements in recent years with the rise of deep neural networks and, in particular, vision transformers. Although these models show promising reconstruction quality and temporal consistency, they are still unsuitable for live videos, one of the last steps to make them completely convincing and usable. The main limitations are that these state-of-the-art models inpaint using the whole video (offline processing) and show an insufficient frame rate. In our approach, we propose a framework to adapt existing inpainting transformers to these constraints by memorizing and refining redundant computations while maintaining a decent inpainting quality. Using this framework with some of the most recent inpainting models, we show great online results with a consistent throughput above 20 frames per second. The code and pretrained models will be made available upon acceptance.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2403.16161
Accession Number: edsarx.2403.16161
Database: arXiv