MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder

Bibliographic Details
Title: MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder
Authors: Li, You-Jin; Wang, Syu-Siang; Tsao, Yu; Su, Borching
Publication Year: 2020
Subject Terms: Electrical Engineering and Systems Science - Audio and Speech Processing, Electrical Engineering and Systems Science - Signal Processing
Description: For speech-related applications in IoT environments, identifying effective methods to handle interfering noise and compress the amount of data in transmissions is essential to achieve high-quality services. In this study, we propose a novel multi-input multi-output speech compression and enhancement (MIMO-SCE) system based on a convolutional denoising autoencoder (CDAE) model to simultaneously improve speech quality and reduce the dimensions of transmission data. Compared with conventional single-channel and multi-input single-output systems, MIMO systems can be employed in applications where multiple acoustic signals need to be handled. We investigated two CDAE models, a fully convolutional network (FCN) and a Sinc FCN, as the core models in MIMO systems. The experimental results confirm that the proposed MIMO-SCE framework effectively improves speech quality and intelligibility while reducing the amount of recording data by a factor of 7 for transmission.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2005.11704
Accession Number: edsarx.2005.11704
Database: arXiv