DB3V: A Dialect Dominated Dataset of Bird Vocalisation for Cross-corpus Bird Species Recognition

التفاصيل البيبلوغرافية
العنوان: DB3V: A Dialect Dominated Dataset of Bird Vocalisation for Cross-corpus Bird Species Recognition
المؤلفون: Jing, Xin, Zhang, Luyang, Xie, Jiangjian, Gebhard, Alexander, Baird, Alice, Schuller, Bjoern
سنة النشر: 2024
المجموعة: Computer Science
مصطلحات موضوعية: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
الوصف: In ornithology, bird species are known to have variedit's widely acknowledged that bird species display diverse dialects in their calls across different regions. Consequently, computational methods to identify bird species onsolely through their calls face critsignificalnt challenges. There is growing interest in understanding the impact of species-specific dialects on the effectiveness of bird species recognition methods. Despite potential mitigation through the expansion of dialect datasets, the absence of publicly available testing data currently impedes robust benchmarking efforts. This paper presents the Dialect Dominated Dataset of Bird Vocalisation, the first cross-corpus dataset that focuses on dialects in bird vocalisations. The DB3V comprises more than 25 hours of audio recordings from 10 bird species distributed across three distinct regions in the contiguous United States (CONUS). In addition to presenting the dataset, we conduct analyses and establish baseline models for cross-corpus bird recognition. The data and code are publicly available online: https://zenodo.org/records/11544734
Comment: accepted by Interspeech 2024
نوع الوثيقة: Working Paper
URL الوصول: http://arxiv.org/abs/2406.08517
رقم الأكسشن: edsarx.2406.08517
قاعدة البيانات: arXiv