Academic Journal

SASEGAN-TCN: Speech enhancement algorithm based on self-attention generative adversarial network and temporal convolutional network

Bibliographic Details
Title: SASEGAN-TCN: Speech enhancement algorithm based on self-attention generative adversarial network and temporal convolutional network
Authors: Rongchuang Lv, Niansheng Chen, Songlin Cheng, Guangyu Fan, Lei Rao, Xiaoyong Song, Wenjing Lv, Dingyu Yang
Source: Mathematical Biosciences and Engineering, Vol 21, Iss 3, Pp 3860-3875 (2024)
Publication Information: AIMS Press, 2024.
Publication Year: 2024
Collection: LCC:Biotechnology
LCC:Mathematics
Subject Terms: speech enhancement, deep learning, generative adversarial network, autoencoder, Biotechnology, TP248.13-248.65, Mathematics, QA1-939
Description: Traditional unsupervised speech enhancement models often fail to aggregate input feature information, which introduces additional noise during training and reduces the quality of the speech signal. To address this, the paper analyzed how the non-aggregation of input speech feature information affects model performance. It then introduced a temporal convolutional network and proposed the SASEGAN-TCN speech enhancement model, which captured local feature information and aggregated global feature information to improve model performance and training stability. In simulation experiments, the model achieved a perceptual evaluation of speech quality (PESQ) score of 2.1636 and a short-time objective intelligibility (STOI) of 92.78% on the Valentini dataset, and 1.8077 and 83.54%, respectively, on the THCHS30 dataset. The enhanced speech data was also used with an acoustic model to verify recognition accuracy: the speech recognition error rate was reduced by 17.4%, a significant improvement over the baseline model results.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1551-0018
Relation: https://doaj.org/toc/1551-0018
DOI: 10.3934/mbe.2024172
Access URL: https://doaj.org/article/a9dbf8ad8fca49fd8c73fc272df14e13
Accession Number: edsdoj.9dbf8ad8fca49fd8c73fc272df14e13
Database: Directory of Open Access Journals