OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning

التفاصيل البيبلوغرافية
العنوان: OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning
المؤلفون: Geng, Xinyu, Wang, Jiaming, Gong, Jiawei, Xue, Yuerong, Xu, Jun, Chen, Fanglin, Huang, Xiaolin
سنة النشر: 2024
المجموعة: Computer Science
مصطلحات موضوعية: Computer Science - Computer Vision and Pattern Recognition
الوصف: Redundancy is a persistent challenge in Capsule Networks (CapsNet),leading to high computational costs and parameter counts. Although previous works have introduced pruning after the initial capsule layer, dynamic routing's fully connected nature and non-orthogonal weight matrices reintroduce redundancy in deeper layers. Besides, dynamic routing requires iterating to converge, further increasing computational demands. In this paper, we propose an Orthogonal Capsule Network (OrthCaps) to reduce redundancy, improve routing performance and decrease parameter counts. Firstly, an efficient pruned capsule layer is introduced to discard redundant capsules. Secondly, dynamic routing is replaced with orthogonal sparse attention routing, eliminating the need for iterations and fully connected structures. Lastly, weight matrices during routing are orthogonalized to sustain low capsule similarity, which is the first approach to introduce orthogonality into CapsNet as far as we know. Our experiments on baseline datasets affirm the efficiency and robustness of OrthCaps in classification tasks, in which ablation studies validate the criticality of each component. Remarkably, OrthCaps-Shallow outperforms other Capsule Network benchmarks on four datasets, utilizing only 110k parameters, which is a mere 1.25% of a standard Capsule Network's total. To the best of our knowledge, it achieves the smallest parameter count among existing Capsule Networks. Similarly, OrthCaps-Deep demonstrates competitive performance across four datasets, utilizing only 1.2% of the parameters required by its counterparts.
Comment: 8 pages
نوع الوثيقة: Working Paper
URL الوصول: http://arxiv.org/abs/2403.13351
رقم الأكسشن: edsarx.2403.13351
قاعدة البيانات: arXiv