An RFP dataset for Real, Fake, and Partially fake audio detection

Bibliographic Details
Title:	An RFP dataset for Real, Fake, and Partially fake audio detection
Authors:	AlAli, Abdulazeez; Theodorakopoulos, George
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Sound, Computer Science - Cryptography and Security, Electrical Engineering and Systems Science - Audio and Speech Processing
Description:	Recent advances in deep learning have enabled the creation of natural-sounding synthesised speech. However, attackers have also utilised these technologies to conduct attacks such as phishing. Numerous public datasets have been created to facilitate the development of effective detection models. However, available datasets contain only entirely fake audio; therefore, detection models may miss attacks that replace a short section of the real audio with fake audio. In recognition of this problem, the current paper presents the RFP dataset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real. The data are then used to evaluate several detection models, revealing that the available detection models incur a markedly higher equal error rate (EER) when detecting PF audio instead of entirely fake audio. The lowest EER recorded was 25.42%. Therefore, we believe that creators of detection models must seriously consider using datasets like RFP that include PF and other types of fake audio.
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2404.17721
Accession Number:	edsarx.2404.17721
Database:	arXiv
