Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces

Bibliographic Details
Title: Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces
Authors: Chen, Zhaohui; Shamsabadi, Elyas Asadi; Jiang, Sheng; Shen, Luming; Dias-da-Costa, Daniel
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition
Description: Convolutional neural networks (CNNs) and Transformers have shown advanced accuracy in crack detection under certain conditions. Yet, the fixed local attention can compromise the generalisation of CNNs, and the quadratic complexity of the global self-attention restricts the practical deployment of Transformers. Given the emergence of the new-generation architecture of Mamba, this paper proposes a Vision Mamba (VMamba)-based framework for crack segmentation on concrete, asphalt, and masonry surfaces, with high accuracy, generalisation, and lower computational complexity. Having 15.6% to 74.5% fewer parameters, the encoder-decoder network integrated with VMamba could obtain up to 2.8% higher mDS than representative CNN-based models while showing about the same performance as Transformer-based models. Moreover, the VMamba-based encoder-decoder network could process high-resolution image input with up to 90.6% lower floating-point operations.
Comment: 23 pages, 9 figures
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2406.16518
Accession Number: edsarx.2406.16518
Database: arXiv