DiDiSpeech: A Large Scale Mandarin Speech Corpus

Bibliographic Details
Title: DiDiSpeech: A Large Scale Mandarin Speech Corpus
Authors: Dongwei Jiang, Wubo Li, Kun Han, Tingwei Guo, Wei Zou, Ruixiong Zhang, Xiangang Li, Cheng Gong, Cheng Wen, Shuaijiang Zhao, Ne Luo
Source: ICASSP
Publication Year: 2020
Subject Terms: Audio and Speech Processing (eess.AS), Computer science, Speech recognition, QUIET, FOS: Electrical engineering, electronic engineering, information engineering, Task analysis, Speech corpus, Scale (music), Speech processing, Mandarin Chinese, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: This paper introduces a new open-sourced Mandarin speech corpus, called DiDiSpeech. It consists of about 800 hours of speech data at a 48 kHz sampling rate from 6000 speakers and the corresponding texts. All speech data in the corpus is recorded in a quiet environment and is suitable for various speech processing tasks, such as voice conversion, multi-speaker text-to-speech and automatic speech recognition. We conduct experiments with multiple speech tasks and evaluate the performance, showing that it is promising to use the corpus for both academic research and practical application. The corpus is available at https://outreach.didichuxing.com/research/opendata/.
5 pages, 2 figures, 11 tables
Language: English
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d3f1c677a8abaf18bfb2ca8aa11f6f09
http://arxiv.org/abs/2010.09275
Rights: OPEN
Accession Number: edsair.doi.dedup.....d3f1c677a8abaf18bfb2ca8aa11f6f09
Database: OpenAIRE