Improving Real-Time Music Accompaniment Separation with MMDenseNet

Bibliographic Details
Title: Improving Real-Time Music Accompaniment Separation with MMDenseNet
Authors: Wang, Chun-Hsiang; Wang, Chung-Che; Wang, Jun-You; Jang, Jyh-Shing Roger; Chu, Yen-Hsun
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Sound, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: Music source separation aims to separate polyphonic music into different types of sources. Most existing methods focus on enhancing the quality of separated results by using a larger model structure, rendering them unsuitable for deployment on edge devices. Moreover, these methods may produce low-quality output when the input duration is short, making them impractical for real-time applications. Therefore, the goal of this paper is to enhance a lightweight model, MMDenseNet, to strike a balance between separation quality and latency for real-time applications. Different directions of improvement are explored or proposed in this paper, including complex ideal ratio mask, self-attention, band-merge-split method, and feature look back. Source-to-distortion ratio, real-time factor, and optimal latency are employed to evaluate the performance. To align with our application requirements, the evaluation process in this paper focuses on the separation performance of the accompaniment part. Experimental results demonstrate that our improvement achieves low real-time factor and optimal latency while maintaining acceptable separation quality.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.00657
Accession Number: edsarx.2407.00657
Database: arXiv
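
Note on the metrics named in the description: the abstract cites source-to-distortion ratio and real-time factor without defining them. The sketch below uses the commonly used textbook definitions (real-time factor as processing time divided by audio duration, and a basic energy-ratio SDR in decibels). These are assumptions for illustration, not the paper's evaluation protocol, and the function names `real_time_factor` and `sdr_db` are hypothetical.

```python
import math


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration (assumed definition).

    A value below 1.0 means the audio is processed faster than it plays.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def sdr_db(reference, estimate):
    """Basic source-to-distortion ratio in dB (assumed definition).

    Computes 10 * log10(sum(s^2) / sum((s - s_hat)^2)) over two
    equal-length sample sequences. Evaluation toolkits often use more
    elaborate variants; this is only the simplest energy-ratio form.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have equal length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0:
        return math.inf   # perfect reconstruction
    if signal_energy == 0:
        return -math.inf  # silent reference, non-zero error
    return 10 * math.log10(signal_energy / error_energy)


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 s of compute for 2.0 s of audio.
    print(real_time_factor(0.5, 2.0))
    # Toy signals: the estimate is half the amplitude of the reference.
    print(sdr_db([2.0, 0.0, -2.0], [1.0, 0.0, -1.0]))
```

Under these definitions, a lower real-time factor and a higher SDR are both better, which matches the trade-off between latency and separation quality that the description refers to.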