Academic Journal

A transfer learning coupled framework for distortion classification in laparoscopic videos.

Bibliographic Details
Title: A transfer learning coupled framework for distortion classification in laparoscopic videos.
Authors: Konduri, Praveen SR, Rao, G Siva Nageswara
Source: Multimedia Tools & Applications; May2024, Vol. 83 Issue 15, p45947-45968, 22p
Subject Terms: CONVOLUTIONAL neural networks, LAPAROSCOPIC surgery, PEARSON correlation (Statistics), TUNNEL ventilation, FEATURE extraction, VIDEO codecs, VIDEO surveillance, CLASSIFICATION
Geographic Terms: KENDALL (Fla.)
Abstract: Capturing laparoscopic videos helps physicians to conduct minor surgeries and treatments effectively. But the problem is that these videos are easily affected by environmental conditions and various distortions that diminish the overall clarity. This reduces the possibility for physicians to complete the treatments successfully. To deal with this, video enhancement techniques are introduced, which require the help of effective distortion-type classification strategies. This work presents a compelling and accurate distortion classification technique based on transfer learning. The proposed work includes three main stages: feature extraction, fine-tuning and classification. The DenseNet-65 convolutional neural network (DDCNN) model has been chosen as the baseline, where the DenseNet-65 is pre-trained on the Imagenet dataset. The pre-trained model is used for feature extraction using the zero-shot transfer learning (ZSTL) technique, where only the first few layers of a model are engaged to extract the crucial spatial features. Then, the fine-tuning process was carried out using the flow direction algorithm (FDA) that tunes the parameters of the top layers. Finally, classification has been done using the softmax classifier, where the model classifies five different distortions in videos such as smoke, AWGN noise, motion blur, defocus blur and uneven illumination. The work has been implemented in Python, and the ICIP2020 challenge dataset is used for evaluations. The achieved accuracy outcomes of different distortion classes are smoke (97.8%), AWGN noise (100%), Motion blur (95.83%), Defocus blur (98.65%) and uneven illumination (99.01%).
Moreover, the performance of a proposed scheme is examined in terms of different performance measures accuracy (98.8%), F1-score (96.9%), processing time (0.037 s), Spearman rank order correlation coefficient SROCC (0.995), Pearson linear correlation coefficient (PLCC) (0.995), and Kendall rank order correlation coefficient (KROCC) (0.857). The performance evaluations proved the efficacy of the proposed method compared to other techniques. [ABSTRACT FROM AUTHOR]
Copyright of Multimedia Tools & Applications is the property of Springer Nature and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
Description
ISSN:13807501
DOI:10.1007/s11042-023-17257-x