Deep Generic Representations for Domain-Generalized Anomalous Sound Detection

Bibliographic Details
Title: Deep Generic Representations for Domain-Generalized Anomalous Sound Detection
Authors: Saengthong, Phurich; Shinozaki, Takahiro
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: Developing a reliable anomalous sound detection (ASD) system requires robustness to noise, adaptation to domain shifts, and effective performance with limited training data. Current leading methods rely on extensive labeled data for each target machine type to train feature extractors using Outlier-Exposure (OE) techniques, yet their performance on the target domain remains sub-optimal. In this paper, we present GenRep, which utilizes generic feature representations from a robust, large-scale pre-trained feature extractor combined with kNN for domain-generalized ASD, without the need for fine-tuning. GenRep incorporates MemMixup, a simple approach for augmenting the target memory bank using nearest source samples, paired with a domain normalization technique to address the imbalance between source and target domains. GenRep outperforms the best OE-based approach without requiring labeled data, achieving an Official Score of 73.79% on the DCASE2023T2 Eval set, and demonstrates robustness in limited-data scenarios. The code is available as open source.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2409.05035
Accession Number: edsarx.2409.05035
Database: arXiv
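
The description names three ingredients: kNN scoring against a memory bank of pre-trained embeddings, MemMixup augmentation of the target bank with nearest source samples, and a domain normalization step. The sketch below illustrates one plausible reading of that pipeline on precomputed embeddings. It is not the authors' implementation: the record gives no formulas, so the mixing weight `lam`, the neighbour counts, the leave-one-out z-score normalization, and the `min` fusion of the two domain scores are all assumptions, and every function name is hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cosine_dist(a, b):
    """Pairwise cosine distance between unit-norm rows of a (n, d) and b (m, d)."""
    return 1.0 - a @ b.T


def mem_mixup(source, target, k=2, lam=0.9):
    """Assumed MemMixup-style step: mix each target embedding with its k
    nearest source embeddings and append the mixtures to the target bank."""
    nn_idx = np.argsort(cosine_dist(target, source), axis=1)[:, :k]  # (n_t, k)
    mixed = lam * target[:, None, :] + (1.0 - lam) * source[nn_idx]  # (n_t, k, d)
    mixed = mixed.reshape(-1, target.shape[1])
    return l2_normalize(np.vstack([target, mixed]))


def knn_score(queries, bank, k=1):
    """Mean cosine distance from each query to its k nearest bank entries."""
    d = cosine_dist(queries, bank)
    return np.sort(d, axis=1)[:, :k].mean(axis=1)


def domain_stats(bank, k=1):
    """Leave-one-out kNN distance statistics inside one bank (assumed
    normalization; the bank must hold more than k entries)."""
    d = cosine_dist(bank, bank)
    np.fill_diagonal(d, np.inf)  # exclude each sample's distance to itself
    nn = np.sort(d, axis=1)[:, :k].mean(axis=1)
    return nn.mean(), nn.std() + 1e-8


def anomaly_score(queries, source_bank, target_bank, k=1):
    """Z-normalize the kNN distance per domain, then keep the smaller score,
    so a query counts as normal if it fits either domain."""
    s_mu, s_sd = domain_stats(source_bank, k)
    t_mu, t_sd = domain_stats(target_bank, k)
    s = (knn_score(queries, source_bank, k) - s_mu) / s_sd
    t = (knn_score(queries, target_bank, k) - t_mu) / t_sd
    return np.minimum(s, t)
```

In use, `source` and `target` would be L2-normalized embeddings of normal training clips from each domain, produced by whichever pre-trained extractor is chosen. Calling `anomaly_score(queries, source, mem_mixup(source, target))` then returns one score per query, with higher values indicating a more anomalous clip. Normalizing each domain separately keeps the much larger source bank from dominating the decision, which is the imbalance the description refers to.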