Vehicle Detection Based on Adaptive Multimodal Feature Fusion and Cross-Modal Vehicle Index Using RGB-T Images

Bibliographic Details
Title:	Vehicle Detection Based on Adaptive Multimodal Feature Fusion and Cross-Modal Vehicle Index Using RGB-T Images
Authors:	Yuanfeng Wu, Xinran Guan, Boya Zhao, Li Ni, Min Huang
Source:	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol 16, Pp 8166-8177 (2023)
Publisher Information:	IEEE, 2023.
Publication Year:	2023
Collection:	LCC:Ocean engineering; LCC:Geophysics. Cosmic physics
Subject Terms:	Adaptive feature fusion, aerial images, channel attention, cross-modal vehicle index, vehicle detection, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
Description:	Target detection is a critical task in interpreting aerial images. Detecting small targets, such as vehicles, is challenging, and lighting conditions affect the accuracy of vehicle detection. For example, vehicles are difficult to distinguish from the background in red, green, blue (RGB) images under low-illumination conditions. In contrast, under high-illumination conditions, the color and texture of vehicles are not significantly different in thermal infrared (TIR) images. To improve the accuracy of vehicle detection under various illumination conditions, we propose an adaptive multimodal feature fusion and cross-modal vehicle index (AFFCM) model for vehicle detection. Based on a single-stage object detection model, AFFCM uses RGB and TIR images. It comprises three parts: 1) the softpooling channel attention (SCA) mechanism calculates the cross-modal feature weights of the RGB and TIR features using a fully connected layer during global weighted pooling; 2) a multimodal adaptive feature fusion (MAFF) module, based on the cross-modal feature weights derived from the SCA mechanism, selects features with high weight, compresses redundant features with low weight, and performs adaptive fusion using a multiscale feature pyramid; and 3) a cross-modal vehicle index is established to extract the target area, suppress complex background information, and minimize false alarms in vehicle detection. The mean average precision (mAP) on the Drone Vehicle dataset is 14.44% and 5.02% higher than that obtained using only RGB or only TIR images, respectively. The mAP is 2.63% higher than that of state-of-the-art methods that utilize RGB and TIR images.
Document Type:	article
File Description:	electronic resource
Language:	English
ISSN:	2151-1535
Relation:	https://ieeexplore.ieee.org/document/10179923/; https://doaj.org/toc/2151-1535
DOI:	10.1109/JSTARS.2023.3294624
Access URL:	https://doaj.org/article/86caf3f2a2b14adf8d89ce6454da81ed
Accession Number:	edsdoj.86caf3f2a2b14adf8d89ce6454da81ed
Database:	Directory of Open Access Journals
