Academic Journal

Semantic segmentation via pixel‐to‐center similarity calculation

Bibliographic Details
Title: Semantic segmentation via pixel‐to‐center similarity calculation
Authors: Dongyue Wu, Zilin Guo, Aoyan Li, Changqian Yu, Nong Sang, Changxin Gao
Source: CAAI Transactions on Intelligence Technology, Vol 9, Iss 1, Pp 87-100 (2024)
Publication Information: Wiley, 2024.
Publication Year: 2024
Collection: LCC:Computational linguistics. Natural language processing
LCC:Computer software
Subject Terms: computer vision, deep neural networks, image segmentation, scene understanding, Computational linguistics. Natural language processing, P98-98.5, Computer software, QA76.75-76.765
Description: Abstract: Since the fully convolutional network achieved great success in semantic segmentation, many works have been proposed to extract discriminative pixel representations. However, the authors observe that existing methods still suffer from two typical challenges: (i) the intra‐class feature variation between different scenes may be large, making it difficult to maintain consistency between same‐class pixels from different scenes; (ii) the inter‐class feature distinction within the same scene can be small, which limits the ability to distinguish different classes in each scene. The authors first rethink semantic segmentation from the perspective of similarity between pixels and class centers. Each weight vector of the segmentation head represents its corresponding semantic class across the whole dataset and can be regarded as the embedding of the class center. Thus, pixel‐wise classification amounts to computing the similarity, in the final feature space, between pixels and the class centers. Under this view, the authors propose a Class Center Similarity (CCS) layer to address the above‐mentioned challenges by generating adaptive class centers conditioned on each scene and supervising the similarities between class centers. The CCS layer utilises the Adaptive Class Center Module to generate class centers conditioned on each scene, which adapt to the large intra‐class variation between different scenes. A specially designed Class Distance Loss (CD Loss) is introduced to control both inter‐class and intra‐class distances based on the predicted center‐to‐center and pixel‐to‐center similarity. Finally, the CCS layer outputs the processed pixel‐to‐center similarity as the segmentation prediction. Extensive experiments demonstrate that the model performs favourably against state‐of‐the‐art methods.
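Illustrative note: the abstract's central observation, that a linear segmentation head is equivalent to a dot-product similarity between each pixel feature and a per-class center, can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the tensor shapes, and in particular `scene_adaptive_centers` (a softmax-weighted mean of pixel features, used here only as a hypothetical stand-in for the Adaptive Class Center Module) are assumptions and are not taken from the paper.

```python
import numpy as np

def pixel_to_center_logits(features, centers):
    """Dot-product similarity between every pixel and every class center.

    features: (H, W, D) pixel embeddings; centers: (K, D) class centers.
    Returns (H, W, K). This is the same arithmetic as a 1x1 linear head
    whose weight rows are the class centers (bias omitted).
    """
    return np.einsum("hwd,kd->hwk", features, centers)

def scene_adaptive_centers(features, coarse_logits):
    """Hypothetical stand-in for scene-conditioned class centers.

    Each center is the mean of the scene's pixel features, weighted by the
    softmax probability that a pixel belongs to that class.
    """
    h, w, d = features.shape
    flat = features.reshape(h * w, d)                     # (N, D)
    logits = coarse_logits.reshape(h * w, -1)             # (N, K)
    logits = logits - logits.max(axis=1, keepdims=True)   # stabilise softmax
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # Normalise over pixels so each class's weights sum to (almost) one.
    weights = probs / (probs.sum(axis=0, keepdims=True) + 1e-8)
    return weights.T @ flat                               # (K, D)

# Toy data: a 4x5 feature map with 8 channels and 3 classes (assumed sizes).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5, 8))
global_centers = rng.normal(size=(3, 8))   # plays the role of head weights

coarse = pixel_to_center_logits(features, global_centers)
centers = scene_adaptive_centers(features, coarse)
prediction = pixel_to_center_logits(features, centers).argmax(axis=-1)
```

In this sketch the dataset-level centers give a coarse prediction, the coarse prediction yields centers specific to the scene, and the final label map is the arg-max of pixel-to-center similarity against those scene-specific centers. The Class Distance Loss described in the abstract is not modelled here.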
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2468-2322
Relation: https://doaj.org/toc/2468-2322
DOI: 10.1049/cit2.12245
Access URL: https://doaj.org/article/4506b2ad08934b96a875adb82d768154
Accession Number: edsdoj.4506b2ad08934b96a875adb82d768154
Database: Directory of Open Access Journals