Attention does not guarantee best performance in speech enhancement

Bibliographic Details
Title: Attention does not guarantee best performance in speech enhancement
Authors: Hou, Zhongshu; Hu, Qinwen; Chen, Kai; Lu, Jing
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: The attention mechanism has been widely utilized in speech enhancement (SE) because, in theory, it can effectively model the long-term inherent connections of the signal in both the time and spectral domains. However, the commonly used global attention mechanism might not be the best choice, since adjacent information naturally exerts more influence than far-apart information in speech enhancement. In this paper, we validate this conjecture by replacing attention with RNNs in two typical state-of-the-art (SOTA) models: the multi-scale temporal frequency convolutional network with axial attention (MTFAA) and the conformer-based MetricGAN network (CMGAN). (An illustrative sketch of this kind of substitution follows the record below.)
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2302.05690
Accession Number: edsarx.2302.05690
Database: arXiv
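
The following is not the authors' code. It is a minimal PyTorch sketch, under assumed module and tensor-shape conventions, of the kind of substitution the abstract describes: swapping a global self-attention block over the time axis for a GRU-based recurrent block that naturally emphasizes nearby frames. All class names, dimensions, and hyperparameters are hypothetical and do not reflect the actual MTFAA or CMGAN architectures.

```python
# Illustrative sketch only -- not the paper's implementation.
import torch
import torch.nn as nn


class AttentionBlock(nn.Module):
    """Global self-attention over the time axis of a (batch, time, feat) tensor."""
    def __init__(self, feat_dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x)   # every frame attends to every other frame
        return self.norm(x + out)     # residual connection


class RecurrentBlock(nn.Module):
    """Hypothetical drop-in replacement: a GRU that processes frames sequentially,
    so nearby context dominates over far-apart frames."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, feat_dim, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.rnn(x)
        return self.norm(x + out)     # residual connection


if __name__ == "__main__":
    x = torch.randn(2, 100, 64)            # (batch, time frames, feature dim)
    print(AttentionBlock(64)(x).shape)     # torch.Size([2, 100, 64])
    print(RecurrentBlock(64)(x).shape)     # torch.Size([2, 100, 64])
```

Because both blocks map a (batch, time, feat) tensor to the same shape, one can be exchanged for the other inside a larger SE model without altering the surrounding layers, which is the spirit of the comparison the abstract describes.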