Academic Journal

Gamified Crowdsourcing as a Novel Approach to Lung Ultrasound Data Set Labeling: Prospective Analysis

Bibliographic Details
Title: Gamified Crowdsourcing as a Novel Approach to Lung Ultrasound Data Set Labeling: Prospective Analysis
Authors: Nicole M Duggan, Mike Jin, Maria Alejandra Duran Mendicuti, Stephen Hallisey, Denie Bernier, Lauren A Selame, Ameneh Asgari-Targhi, Chanel E Fischetti, Ruben Lucassen, Anthony E Samir, Erik Duhaime, Tina Kapur, Andrew J Goldsmith
Source: Journal of Medical Internet Research, Vol 26, p e51397 (2024)
Publication Information: JMIR Publications, 2024.
Publication Year: 2024
Collection: LCC:Computer applications to medicine. Medical informatics
LCC:Public aspects of medicine
Subject Terms: Computer applications to medicine. Medical informatics, R858-859.7, Public aspects of medicine, RA1-1270
Description:
Background: Machine learning (ML) models can yield faster and more accurate medical diagnoses; however, ML model development is limited by a lack of high-quality labeled training data. Crowdsourced labeling is a potential solution but can be constrained by concerns about label quality.
Objective: This study aims to examine whether a gamified crowdsourcing platform with continuous performance assessment, user feedback, and performance-based incentives could produce expert-quality labels on medical imaging data.
Methods: In this diagnostic comparison study, 2384 lung ultrasound clips were retrospectively collected from 203 emergency department patients. A total of 6 lung ultrasound experts classified 393 of these clips as having no B-lines, one or more discrete B-lines, or confluent B-lines to create 2 reference standard data sets (195 training clips and 198 test clips). These sets were used, respectively, to (1) train users on a gamified crowdsourcing platform and (2) compare the concordance of the resulting crowd labels with the concordance of individual experts against the reference standards. Crowd opinions were sourced from DiagnosUs (Centaur Labs) iOS app users over 8 days, filtered based on past performance, aggregated using majority rule, and analyzed for label concordance against a hold-out test set of expert-labeled clips. The primary outcome was the labeling concordance of collated crowd opinions compared with that of trained experts in classifying B-lines on lung ultrasound clips.
Results: Our clinical data set included patients with a mean age of 60.0 (SD 19.0) years; 105 (51.7%) patients were female and 114 (56.1%) patients were White. Over the 195 training clips, the expert-consensus label distribution was 114 (58%) no B-lines, 56 (29%) discrete B-lines, and 25 (13%) confluent B-lines. Over the 198 test clips, the expert-consensus label distribution was 138 (70%) no B-lines, 36 (18%) discrete B-lines, and 24 (12%) confluent B-lines. In total, 99,238 opinions were collected from 426 unique users. On the test set of 198 clips, the mean labeling concordance of individual experts relative to the reference standard was 85.0% (SE 2.0), compared with 87.9% crowdsourced label concordance (P=.15). When individual experts' opinions were compared with reference standard labels created by majority vote excluding their own opinion, crowd concordance was higher than the mean concordance of individual experts to reference standards (87.4% vs 80.8%, SE 1.6 for expert concordance; P
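The aggregation scheme the abstract describes — collapsing many crowd opinions per clip by majority rule, then scoring concordance against an expert reference standard — can be sketched as follows. This is an illustrative sketch only; the function names, label strings, and toy data are assumptions, not taken from the study's code.

```python
# Illustrative sketch of majority-rule label aggregation and concordance
# scoring, as described in the abstract. All names and data are hypothetical.
from collections import Counter

def majority_label(opinions):
    """Collapse a list of per-clip crowd opinions into a single label
    by majority rule (ties broken by first-counted order)."""
    return Counter(opinions).most_common(1)[0][0]

def concordance(crowd_opinions, reference):
    """Fraction of clips whose aggregated crowd label matches the
    expert reference-standard label."""
    matches = sum(
        majority_label(crowd_opinions[clip]) == ref_label
        for clip, ref_label in reference.items()
    )
    return matches / len(reference)

# Toy example: three clips, three crowd opinions each.
crowd = {
    "clip1": ["no_b_lines", "no_b_lines", "discrete_b_lines"],
    "clip2": ["confluent_b_lines", "confluent_b_lines", "confluent_b_lines"],
    "clip3": ["discrete_b_lines", "no_b_lines", "no_b_lines"],
}
reference = {
    "clip1": "no_b_lines",
    "clip2": "confluent_b_lines",
    "clip3": "discrete_b_lines",
}
print(concordance(crowd, reference))  # 2 of 3 clips agree -> 0.666...
```

In the study itself, opinions were additionally filtered by users' past performance before aggregation; that filtering step is omitted here for brevity.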
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1438-8871
Relation: https://www.jmir.org/2024/1/e51397; https://doaj.org/toc/1438-8871
DOI: 10.2196/51397
Access URL: https://doaj.org/article/f65d391804b246618693357f9d91da2c
Accession Number: edsdoj.f65d391804b246618693357f9d91da2c
Database: Directory of Open Access Journals