Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling

Bibliographic Details
Title:	Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling
Authors:	Boros, Tiberiu; Dumitrescu, Stefan Daniel; Mironica, Ionut; Chivereanu, Radu
Publication Year:	2023
Collection:	Computer Science
Subject Terms:	Computer Science - Machine Learning
Description:	We describe an end-to-end speech synthesis system that uses generative adversarial training. We train our vocoder for raw phoneme-to-audio conversion, using explicit phonetic, pitch and duration modeling. We experiment with several pre-trained models for contextualized and decontextualized word embeddings, and we introduce a new method for highly expressive character voice matching, based on discrete style tokens.
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2310.09636
Accession Number:	edsarx.2310.09636
Database:	arXiv
