AISPEECH-SJTU accent identification system for the Accented English Speech Recognition Challenge

Bibliographic details
Title: AISPEECH-SJTU accent identification system for the Accented English Speech Recognition Challenge
Authors: Yexin Yang, Xu Xiang, Houjun Huang, Rao Ma, Yanmin Qian
Source: ICASSP
Publication year: 2021
Subject terms: FOS: Computer and information sciences, Sound (cs.SD), Training set, Computer science, Speech recognition, Pipeline (software), Computer Science - Sound, Identification system, Identification (information), Margin (machine learning), Phone, Audio and Speech Processing (eess.AS), Stress (linguistics), Feature (machine learning), FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: This paper describes the AISpeech-SJTU system for the accent identification track of the Interspeech-2020 Accented English Speech Recognition Challenge. In this challenge track, only 160 hours of accented English data collected from 8 countries and the auxiliary Librispeech dataset are provided for training. To build an accurate and robust accent identification system, we explore the whole system pipeline in detail. First, we introduce the ASR-based phone posteriorgram (PPG) feature to accent identification and verify its efficacy. Then, for the first time, a novel TTS-based approach is carefully designed to augment the very limited accent training data. Finally, we propose test-time augmentation and embedding fusion schemes to further improve system performance. Our final system is ranked first in the challenge and outperforms all other participants by a large margin. The submitted system achieves 83.63% average accuracy on the challenge evaluation data, ahead of the others by more than 10% in absolute terms.
Accepted to ICASSP 2021
Language: English
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::dae91e6747774c8242c59a0171e75ea9
http://arxiv.org/abs/2102.09828
Rights: OPEN
Accession number: edsair.doi.dedup.....dae91e6747774c8242c59a0171e75ea9
Database: OpenAIRE