Academic Journal

DCNN for Pig Vocalization and Non-Vocalization Classification: Evaluate Model Robustness with New Data

Bibliographic Details
Title: DCNN for Pig Vocalization and Non-Vocalization Classification: Evaluate Model Robustness with New Data
Authors: Vandet Pann, Kyeong-seok Kwon, Byeonghyeon Kim, Dong-Hwa Jang, Jong-Bok Kim
Source: Animals, Vol 14, Iss 14, p 2029 (2024)
Publisher Information: MDPI AG, 2024.
Publication Year: 2024
Collection: LCC:Veterinary medicine
LCC:Zoology
Subject Terms: audio classification, audio feature extraction, pig vocalization, smart farming, audio data augmentation, machine learning, Veterinary medicine, SF600-1100, Zoology, QL1-991
Description: Since pig vocalization is an important indicator for monitoring pig conditions, pig vocalization detection and recognition using deep learning play a crucial role in the management and welfare of modern pig livestock farming. However, collecting pig sound data for deep learning model training takes time and effort. Acknowledging the challenges of collecting pig sound data for model training, this study introduces a deep convolutional neural network (DCNN) architecture for pig vocalization and non-vocalization classification with a real pig farm dataset. Various audio feature extraction methods were evaluated individually to compare the performance differences, including Mel-frequency cepstral coefficients (MFCC), Mel-spectrogram, Chroma, and Tonnetz. This study proposes a novel feature extraction method called Mixed-MMCT to improve the classification accuracy by integrating MFCC, Mel-spectrogram, Chroma, and Tonnetz features. These feature extraction methods were applied to extract relevant features from the pig sound dataset for input into a deep learning network. For the experiment, three datasets were collected from three actual pig farms: Nias, Gimje, and Jeongeup. Each dataset consists of 4000 WAV files (2000 pig vocalization and 2000 pig non-vocalization) with a duration of three seconds. Various audio data augmentation techniques were utilized in the training set to improve the model performance and generalization, including pitch-shifting, time-shifting, time-stretching, and background-noising. In this study, the performance of the predictive deep learning model was assessed using the k-fold cross-validation (k = 5) technique on each dataset. By conducting rigorous experiments, Mixed-MMCT showed superior accuracy on Nias, Gimje, and Jeongeup, with rates of 99.50%, 99.56%, and 99.67%, respectively. Robustness experiments were performed to prove the effectiveness of the model by using two farm datasets as a training set and the remaining farm dataset as a testing set.
The average performance of Mixed-MMCT in terms of accuracy, precision, recall, and F1-score reached rates of 95.67%, 96.25%, 95.68%, and 95.96%, respectively. All results demonstrate that the proposed Mixed-MMCT feature extraction method outperforms other methods regarding pig vocalization and non-vocalization classification in real pig livestock farming.
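The abstract describes Mixed-MMCT as integrating MFCC, Mel-spectrogram, Chroma, and Tonnetz features into one input representation. A minimal sketch of such feature-level fusion is shown below, assuming each extractor (e.g. from a library such as librosa) yields a per-frame matrix and that "integrating" means concatenation along the feature axis; the function name `mixed_mmct` and the exact dimensions are illustrative, not taken from the paper.

```python
import numpy as np

def mixed_mmct(mfcc, mel, chroma, tonnetz):
    """Fuse four per-frame feature matrices along the feature axis.

    Each input is assumed to have shape (n_features_i, n_frames) with a
    shared frame count, as produced by typical audio feature extractors.
    Concatenation as the fusion step is an assumption about Mixed-MMCT.
    """
    for mat in (mel, chroma, tonnetz):
        assert mat.shape[1] == mfcc.shape[1], "frame counts must match"
    return np.concatenate([mfcc, mel, chroma, tonnetz], axis=0)

# Toy matrices standing in for real extractor output over 130 frames:
# 13 MFCCs, 128 Mel bands, 12 chroma bins, 6 Tonnetz dimensions.
frames = 130
mixed = mixed_mmct(
    np.zeros((13, frames)),
    np.zeros((128, frames)),
    np.zeros((12, frames)),
    np.zeros((6, frames)),
)
print(mixed.shape)  # (159, 130)
```

The fused matrix can then be treated as a single 2-D input to a DCNN, in the same way a spectrogram image would be.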
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2076-2615
Relation: https://www.mdpi.com/2076-2615/14/14/2029; https://doaj.org/toc/2076-2615
DOI: 10.3390/ani14142029
Access URL: https://doaj.org/article/7e84e187066b4dd999e007a28f3536b8
Accession Number: edsdoj.7e84e187066b4dd999e007a28f3536b8
Database: Directory of Open Access Journals