CMC2R: Cross‐modal collaborative contextual representation for RGBT tracking

Bibliographic Details
Title:	CMC2R: Cross‐modal collaborative contextual representation for RGBT tracking
Authors:	Xiaohu Liu, Yichuang Luo, Keding Yan, Jianfei Chen, Zhiyong Lei
Source:	IET Image Processing, Vol 16, Iss 5, Pp 1500-1510 (2022)
Publisher Information:	Wiley, 2022.
Publication Year:	2022
Collection:	LCC:Computer software
Subject Terms:	Photography, TR1-1050, Computer software, QA76.75-76.765
Description:	Abstract: The key challenge in RGBT tracking is how to fuse dual‐modality information to build a robust RGB‐T tracker. Motivated by the strength of CNN structures at capturing local features and of visual transformer structures at capturing global representations, the authors propose a two‐stream hybrid structure, termed CMC2R, that takes advantage of convolutional operations and self‐attention mechanisms to learn an enhanced representation. CMC2R fuses local features and global representations at different resolutions through the transformer layer of the encoder block, and the two modalities collaborate to obtain contextual information through spatial and channel self‐attention. Temporal association is performed with track queries: each track query models the entire track of an object and is updated frame by frame to build the long‐range temporal relation. Experimental results show the effectiveness of the proposed method, which achieves state‐of‐the‐art performance.
Document Type:	article
File Description:	electronic resource
Language:	English
ISSN:	1751-9667 1751-9659 62747150
Relation:	https://doaj.org/toc/1751-9659; https://doaj.org/toc/1751-9667
DOI:	10.1049/ipr2.12427
Access URL:	https://doaj.org/article/969685db59d842e8b62747150b88a7ed
Accession Number:	edsdoj.969685db59d842e8b62747150b88a7ed
Database:	Directory of Open Access Journals

