Academic Journal

Impact of Confounding Thoracic Tubes and Pleural Dehiscence Extent on Artificial Intelligence Pneumothorax Detection in Chest Radiographs.

Bibliographic Details
Title: Impact of Confounding Thoracic Tubes and Pleural Dehiscence Extent on Artificial Intelligence Pneumothorax Detection in Chest Radiographs.
Authors: Rueckel J; From the Department of Radiology, University Hospital, LMU Munich., Trappmann L; From the Department of Radiology, University Hospital, LMU Munich., Schachtner B, Wesp P; From the Department of Radiology, University Hospital, LMU Munich., Hoppe BF; From the Department of Radiology, University Hospital, LMU Munich., Fink N; From the Department of Radiology, University Hospital, LMU Munich., Ricke J; From the Department of Radiology, University Hospital, LMU Munich., Dinkel J, Ingrisch M; From the Department of Radiology, University Hospital, LMU Munich., Sabel BO; From the Department of Radiology, University Hospital, LMU Munich.
Source: Investigative radiology [Invest Radiol] 2020 Dec; Vol. 55 (12), pp. 792-798.
Publication Type: Journal Article; Research Support, Non-U.S. Gov't
Language: English
Journal Info: Publisher: Lippincott Williams & Wilkins Country of Publication: United States NLM ID: 0045377 Publication Model: Print Cited Medium: Internet ISSN: 1536-0210 (Electronic) Linking ISSN: 00209996 NLM ISO Abbreviation: Invest Radiol Subsets: MEDLINE
Imprint Name(s): Publication: 1998- : Hagerstown, MD : Lippincott Williams & Wilkins
Original Publication: Philadelphia.
MeSH Terms: Artificial Intelligence*; Radiography, Thoracic*; Image Processing, Computer-Assisted/*methods; Pleural Cavity/*diagnostic imaging; Pneumothorax/*diagnostic imaging; Case-Control Studies; Cohort Studies; Female; Humans; ROC Curve; Retrospective Studies
Abstract: Objectives: We hypothesized that published performances of algorithms for artificial intelligence (AI) pneumothorax (PTX) detection in chest radiographs (CXRs) do not sufficiently consider the influence of PTX size and confounding effects caused by thoracic tubes (TTs). Therefore, we established a radiologically annotated benchmarking cohort (n = 6446) allowing for a detailed subgroup analysis.
Materials and Methods: We retrospectively identified 6434 supine CXRs, among them 1652 PTX-positive cases and 4782 PTX-negative cases. Supine CXRs were radiologically annotated for PTX size, PTX location, and inserted TTs. The diagnostic performances of 2 AI algorithms ("AI_CheXNet" [Rajpurkar et al], "AI_1.5" [Guendel et al]), both trained on publicly available datasets with labels obtained from automatic report interpretation, were quantified. The algorithms' discriminative power for PTX detection was quantified by the area under the receiver operating characteristic curve (AUROC), and significance analysis was based on the corresponding 95% confidence interval. A detailed subgroup analysis was performed to quantify the influence of PTX size and the confounding effects caused by inserted TTs.
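The AUROC and its 95% confidence interval described above can be illustrated with a minimal Python sketch. It assumes a DeLong-style variance estimate, chosen because the reference list cites DeLong et al.; whether the authors computed their intervals in exactly this way is an assumption. The function names `psi` and `auroc_with_ci` and the example scores are hypothetical and not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def psi(pos_score, neg_score):
    """Mann-Whitney kernel: 1 if the positive outranks the negative, 0.5 on ties."""
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0


def auroc_with_ci(pos_scores, neg_scores, z=1.96):
    """Return (auroc, lower, upper) using a DeLong-style variance estimate.

    pos_scores: model outputs for PTX-positive cases (at least two values).
    neg_scores: model outputs for PTX-negative cases (at least two values).
    """
    m, n = len(pos_scores), len(neg_scores)
    # Structural components: average kernel value per positive and per negative case.
    v10 = [mean(psi(p, q) for q in neg_scores) for p in pos_scores]
    v01 = [mean(psi(p, q) for p in pos_scores) for q in neg_scores]
    auroc = mean(v10)
    # Variance of the AUROC estimate from the two sample variances.
    se = sqrt(variance(v10) / m + variance(v01) / n)
    return auroc, max(0.0, auroc - z * se), min(1.0, auroc + z * se)


# Invented scores for illustration only (not study data).
positives = [0.9, 0.8, 0.7, 0.55, 0.4]
negatives = [0.6, 0.5, 0.3, 0.2, 0.1, 0.35]
auc, lower, upper = auroc_with_ci(positives, negatives)
print(f"AUROC = {auc:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

The interval is clipped to [0, 1] because the normal approximation can overshoot when the AUROC is close to 1, as it does for samples this small.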
Results: Algorithm performance was quantified as follows: overall AUROCs of 0.704 (AI_1.5) / 0.765 (AI_CheXNet) for unilateral PTXs, AUROCs of 0.666 (AI_1.5) / 0.722 (AI_CheXNet) for unilateral PTXs smaller than 1 cm, and AUROCs of 0.735 (AI_1.5) / 0.818 (AI_CheXNet) for unilateral PTXs larger than 2 cm. Subgroup analysis identified TTs as strong confounders that significantly influence algorithm performance: discriminative power was completely eliminated when PTX-positive cases without TTs were compared against PTX-negative control cases with inserted TTs. Conversely, AUROCs increased to as much as 0.875 (AI_CheXNet) when large PTX-positive cases with inserted TTs were compared against control cases without TTs.
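The confounding effect reported above can be made concrete with a small Python sketch of a subgroup comparison. Everything in it is an assumption for illustration: the `Case` record, the `subgroup_auroc` helper, and the toy scores, which are invented to mimic a hypothetical model that reacts mainly to thoracic tubes and are not study data.

```python
from typing import NamedTuple


class Case(NamedTuple):
    ptx: bool     # radiologist label: pneumothorax present
    tube: bool    # radiologist label: thoracic tube inserted
    score: float  # hypothetical AI output score


def auroc(pos, neg):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auroc(cases, pos_tube, neg_tube):
    """AUROC of PTX-positive cases with tube status pos_tube
    referenced to PTX-negative controls with tube status neg_tube."""
    pos = [c.score for c in cases if c.ptx and c.tube == pos_tube]
    neg = [c.score for c in cases if not c.ptx and c.tube == neg_tube]
    return auroc(pos, neg)


# Invented toy scores: tube presence raises the score more than PTX does.
cases = [
    Case(True, True, 0.90), Case(True, True, 0.80),
    Case(True, False, 0.45), Case(True, False, 0.35),
    Case(False, True, 0.70), Case(False, True, 0.30),
    Case(False, False, 0.20), Case(False, False, 0.10),
]

favourable = subgroup_auroc(cases, pos_tube=True, neg_tube=False)
adverse = subgroup_auroc(cases, pos_tube=False, neg_tube=True)
print(f"PTX+ with tube vs controls without tube: {favourable:.2f}")
print(f"PTX+ without tube vs controls with tube: {adverse:.2f}")
```

With these invented numbers the first comparison yields 1.00 and the second 0.50, which mirrors the direction of the reported effect: the same scores look excellent or uninformative depending solely on how tube status is distributed between cases and controls.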
Conclusions: Our detailed subgroup analysis demonstrated that the performance of established AI algorithms for PTX detection trained on public datasets strongly depends on PTX size and is significantly biased by confounding image features, such as inserted TTs. Our established, clinically relevant and radiologically annotated benchmarking cohort might be of great benefit for ongoing algorithm development.
References: Raoof S, Feigin D, Sung A, et al. Interpretation of plain chest roentgenogram. Chest. 2012;141:545–558.
Kallianos K, Mongan J, Antani S, et al. How far have we come? Artificial intelligence for chest radiograph interpretation. Clin Radiol. 2019;74:338–345.
CheXNet: radiologist-level pneumonia detection on chest X-rays with deep learning. Cited October 29, 2018. https://stanfordmlgroup.github.io/projects/chexnet/. Accessed March 5, 2020.
Rueckel J, Kunz WG, Hoppe BF, et al. Artificial intelligence algorithm detecting lung infection in supine chest radiographs of critically ill patients with a diagnostic accuracy similar to board-certified radiologists. Crit Care Med. 2020;48:e574–e583.
Rajpurkar P, Irvin J, Ball RL, et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 2018;15:e1002686.
Taylor AG, Mielke C, Mongan J. Automated detection of moderate and large pneumothorax on frontal chest X-rays using deep convolutional neural networks: a retrospective study. PLoS Med. 2018;15:e1002697.
Park S, Lee SM, Kim N, et al. Application of deep learning-based computer-aided detection system: detecting pneumothorax on chest radiograph after biopsy. Eur Radiol. 2019;29:5341–5348.
Wang X, Peng Y, Lu L, et al. ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017:3462–3471.
Yao L, Poblenz E, Dagunts D, et al. Learning to diagnose from scratch by exploiting dependencies among labels. arXiv:1710.10501. 2017. Cited September 2, 2019. http://arxiv.org/abs/1710.10501. Accessed March 5, 2020.
Weng X. arnoweng/CheXNet. 2020. Cited May 1, 2020. https://github.com/arnoweng/CheXNet. Accessed March 5, 2020.
Gohagan JK, Prorok PC, Hayes RB, et al. The Prostate, Lung, Colorectal and Ovarian (PLCO) cancer screening trial of the National Cancer Institute: history, organization, and status. Control Clin Trials. 2000;21:251S–272S.
Guendel S, Grbic S, Georgescu B, et al. Learning to recognize abnormalities in chest X-rays with location-aware dense networks. arXiv:1803.04565 [cs]. 2018. Cited February 19, 2020. http://arxiv.org/abs/1803.04565. Accessed March 5, 2020.
Guendel S, Ghesu FC, Grbic S, et al. Multi-task learning for chest X-ray abnormality classification on noisy labels. arXiv:1905.06362 [cs]. 2019. Cited October 3, 2019. http://arxiv.org/abs/1905.06362. Accessed March 5, 2020.
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–845.
Sun X, Xu W. Fast implementation of DeLong's algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process Lett. 2014;21:1389–1393.
Thelle A, Gjerdevik M, Grydeland T, et al. Pneumothorax size measurements on digital chest radiographs: intra- and inter-rater reliability. Eur J Radiol. 2015;84:2038–2043.
Entry Dates: Date Created: 20200723 Date Completed: 20210519 Latest Revision: 20210519
Update Code: 20221213
DOI: 10.1097/RLI.0000000000000707
PMID: 32694453
Database: MEDLINE