Scalable SoftGroup for 3D Instance Segmentation on Point Clouds

التفاصيل البيبلوغرافية
العنوان: Scalable SoftGroup for 3D Instance Segmentation on Point Clouds
المؤلفون: Vu, Thang, Kim, Kookhoi, Luu, Tung M., Nguyen, Thanh, Kim, Junyeong, Yoo, Chang D.
سنة النشر: 2022
المجموعة: Computer Science
مصطلحات موضوعية: Computer Science - Computer Vision and Pattern Recognition
الوصف: This paper considers a network referred to as SoftGroup for accurate and scalable 3D instance segmentation. Existing state-of-the-art methods produce hard semantic predictions followed by grouping instance segmentation results. Unfortunately, errors stemming from hard decisions propagate into the grouping, resulting in poor overlap between predicted instances and ground truth and substantial false positives. To address the abovementioned problems, SoftGroup allows each point to be associated with multiple classes to mitigate the uncertainty stemming from semantic prediction. It also suppresses false positive instances by learning to categorize them as background. Regarding scalability, the existing fast methods require computational time on the order of tens of seconds on large-scale scenes, which is unsatisfactory and far from applicable for real-time. Our finding is that the $k$-Nearest Neighbor ($k$-NN) module, which serves as the prerequisite of grouping, introduces a computational bottleneck. SoftGroup is extended to resolve this computational bottleneck, referred to as SoftGroup++. The proposed SoftGroup++ reduces time complexity with octree $k$-NN and reduces search space with class-aware pyramid scaling and late devoxelization. Experimental results on various indoor and outdoor datasets demonstrate the efficacy and generality of the proposed SoftGroup and SoftGroup++. Their performances surpass the best-performing baseline by a large margin (6\% $\sim$ 16\%) in terms of AP$_{50}$. On datasets with large-scale scenes, SoftGroup++ achieves a 6$\times$ speed boost on average compared to SoftGroup. Furthermore, SoftGroup can be extended to perform object detection and panoptic segmentation with nontrivial improvements over existing methods. The source code and trained models are available at \url{https://github.com/thangvubk/SoftGroup}.
Comment: Accepted by TPAMI. Extension of arXiv:2203.01509
نوع الوثيقة: Working Paper
URL الوصول: http://arxiv.org/abs/2209.08263
رقم الأكسشن: edsarx.2209.08263
قاعدة البيانات: arXiv