MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features

Bibliographic Details
Title: MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features
Authors: Wagner, Royden; Klemp, Marvin; Lopez, Carlos Fernandez
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Description: In self-driving applications, LiDAR data provides accurate information about distances in 3D but lacks the semantic richness of camera data. Therefore, state-of-the-art methods for perception in urban scenes fuse data from both sensor types. In this work, we introduce a novel self-supervised method to fuse LiDAR and camera data for self-driving applications. We build upon masked autoencoders (MAEs) and train deep learning models to reconstruct masked LiDAR data from fused LiDAR and camera features. In contrast to related methods that use bird's-eye-view representations, we fuse features from dense spherical LiDAR projections and features from fish-eye camera crops with a similar field of view. Therefore, we reduce the learned spatial transformations to moderate perspective transformations and do not require additional modules to generate dense LiDAR representations. Code is available at: https://github.com/KIT-MRT/masked-fusion-360
Comment: Technical report, 6 pages, 4 figures, accepted at ICLR 2023 Tiny Papers
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2306.07087
Accession Number: edsarx.2306.07087
Database: arXiv
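
The description mentions two building blocks: dense spherical LiDAR projections and MAE-style masking of the LiDAR input. The following is a minimal sketch of both ideas, not the authors' implementation (their code is at the repository linked above). The function names `spherical_projection` and `mask_patches`, the image size, the vertical field of view, the patch size, and the mask ratio are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def spherical_projection(points, height=32, width=256,
                         fov_up_deg=15.0, fov_down_deg=-25.0):
    """Project (N, 3) LiDAR points into a (height, width) range image.

    All default values are hypothetical, chosen only for illustration.
    """
    ranges = np.linalg.norm(points, axis=1)
    keep = ranges > 1e-6                      # drop points at the sensor origin
    x, y, z = points[keep, 0], points[keep, 1], points[keep, 2]
    ranges = ranges[keep]

    azimuth = np.arctan2(y, x)                # horizontal angle in [-pi, pi]
    elevation = np.arcsin(z / ranges)         # vertical angle

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)

    # Map angles to continuous pixel coordinates, then to integer indices.
    u = 0.5 * (1.0 - azimuth / np.pi) * width
    v = (fov_up - elevation) / (fov_up - fov_down) * height
    cols = np.clip(np.floor(u), 0, width - 1).astype(int)
    rows = np.clip(np.floor(v), 0, height - 1).astype(int)

    # Write far points first so that the nearest return per pixel is kept
    # (for repeated indices NumPy keeps the last assigned value).
    order = np.argsort(-ranges)
    image = np.zeros((height, width), dtype=np.float32)
    image[rows[order], cols[order]] = ranges[order]
    return image


def mask_patches(image, patch=8, mask_ratio=0.75, seed=0):
    """Zero out a random subset of non-overlapping patches (MAE-style)."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile into patches"
    grid_h, grid_w = h // patch, w // patch
    n_patches = grid_h * grid_w
    n_masked = int(round(mask_ratio * n_patches))

    rng = np.random.default_rng(seed)
    masked_ids = rng.choice(n_patches, size=n_masked, replace=False)
    patch_mask = np.zeros(n_patches, dtype=bool)
    patch_mask[masked_ids] = True

    # Expand the per-patch mask to a per-pixel mask.
    pixel_mask = np.repeat(
        np.repeat(patch_mask.reshape(grid_h, grid_w), patch, axis=0),
        patch, axis=1)
    masked_image = np.where(pixel_mask, 0.0, image).astype(image.dtype)
    return masked_image, pixel_mask
```

In a training setup of the kind the description outlines, a model would receive the masked range image together with camera features and be asked to reconstruct the range values under `pixel_mask`. The fusion model itself is outside the scope of this sketch.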