Academic Journal

Merlin: A Vision Language Foundation Model for 3D Computed Tomography.

Bibliographic Details
Title: Merlin: A Vision Language Foundation Model for 3D Computed Tomography.
Authors: Blankemeier L; Department of Electrical Engineering, Stanford University.; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Cohen JP; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University., Kumar A; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Van Veen D; Department of Electrical Engineering, Stanford University.; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Gardezi SJS; Department of Radiology, University of Wisconsin-Madison., Paschali M; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Chen Z; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Delbrouck JB; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Reis E; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Truyts C; Department of Radiology, Hospital Israelita Albert Einstein., Bluethgen C; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, University Hospital Zurich., Jensen MEK; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Ostmeier S; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Varma M; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University.; Department of Computer Science, Stanford University., Valanarasu JMJ; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University.; Department of Computer Science, Stanford University., Fang Z; Department of Radiology, Stanford University., Huo Z; Department of Biomedical Data Science, Stanford University., Nabulsi Z; Department of Electrical Engineering, Stanford University.; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University.; Department of Radiology, University of Wisconsin-Madison.; Department of Radiology, Hospital Israelita Albert Einstein.; Department of Radiology, University Hospital Zurich.; Department of Computer Science, Stanford University.; Department of Biomedical Data Science, Stanford University.; Department of Medicine, Stanford University., Ardila D; Department of Electrical Engineering, Stanford University.; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University.; Department of Radiology, University of Wisconsin-Madison.; Department of Radiology, Hospital Israelita Albert Einstein.; Department of Radiology, University Hospital Zurich.; Department of Computer Science, Stanford University.; Department of Biomedical Data Science, Stanford University.; Department of Medicine, Stanford University., Weng WH; Department of Electrical Engineering, Stanford University.; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University.; Department of Radiology, University of Wisconsin-Madison.; Department of Radiology, Hospital Israelita Albert Einstein.; Department of Radiology, University Hospital Zurich.; Department of Computer Science, Stanford University.; Department of Biomedical Data Science, Stanford University.; Department of Medicine, Stanford University., Amaro E; Department of Radiology, Hospital Israelita Albert Einstein., Ahuja N; Department of Medicine, Stanford University., Fries J; Department of Computer Science, Stanford University.; Department of Biomedical Data Science, Stanford University., Shah NH; Department of Radiology, Stanford University.; Department of Biomedical Data Science, Stanford University., Johnston A; Department of Radiology, Stanford University., Boutin RD; Department of Radiology, Stanford University., Wentland A; Department of Radiology, University of Wisconsin-Madison., Langlotz CP; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University., Hom J; Department of Medicine, Stanford University., Gatidis S; Department of Radiology, Hospital Israelita Albert Einstein., Chaudhari AS; Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University.; Department of Radiology, Stanford University.; Department of Biomedical Data Science, Stanford University.
Source: Research Square [Res Sq] 2024 Jun 28. Date of Electronic Publication: 2024 Jun 28.
Publication Type: Journal Article; Preprint
Language: English
Journal Info: Country of Publication: United States NLM ID: 101768035 Publication Model: Electronic Cited Medium: Internet ISSN: 2693-5015 (Electronic) Linking ISSN: 26935015 NLM ISO Abbreviation: Res Sq Subsets: PubMed not MEDLINE
Abstract: Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current shortage of both general and specialized radiologists, there is a strong impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies while simultaneously using the images to extract novel physiological insights. Prior state-of-the-art approaches for automated medical image interpretation leverage vision language models (VLMs) that use both the image and the corresponding textual radiology reports. However, current medical VLMs are generally limited to 2D images and short reports. To overcome these shortcomings for abdominal CT interpretation, we introduce Merlin, a 3D VLM that leverages both structured electronic health records (EHR) and unstructured radiology reports for pretraining without requiring additional manual annotations. We train Merlin using a high-quality clinical dataset of paired CT scans (6+ million images from 15,331 CTs), EHR diagnosis codes (1.8+ million codes), and radiology reports (6+ million tokens). We comprehensively evaluate Merlin on 6 task types and 752 individual tasks. The non-adapted (off-the-shelf) tasks include zero-shot findings classification (31 findings), phenotype classification (692 phenotypes), and zero-shot cross-modal retrieval (image to findings and image to impressions), while model-adapted tasks include 5-year chronic disease prediction (6 diseases), radiology report generation, and 3D semantic segmentation (20 organs). We perform internal validation on a test set of 5,137 CTs, and external validation on 7,000 clinical CTs and on two public CT datasets (VerSe, TotalSegmentator). Beyond these clinically relevant evaluations, we assess the efficacy of various network architectures and training strategies to show that Merlin performs favorably relative to existing task-specific baselines. We derive data scaling laws to empirically assess training data needs for requisite downstream task performance. Furthermore, unlike conventional VLMs that require hundreds of GPUs for training, we perform all training on a single GPU. This computationally efficient design can help democratize foundation model training, especially for health systems with compute constraints. We plan to release our trained models, code, and dataset, pending manual removal of all protected health information.
Grant Information: R01 EB002524 United States EB NIBIB NIH HHS; 75N92020C00021 United States HL NHLBI NIH HHS; R01 AR077604 United States AR NIAMS NIH HHS; P41 EB027060 United States EB NIBIB NIH HHS; 75N92020C00008 United States HL NHLBI NIH HHS; R01 HL167974 United States HL NHLBI NIH HHS; R01 AR079431 United States AR NIAMS NIH HHS
Entry Dates: Date Created: 20240709 Latest Revision: 20240816
Update Code: 20240816
PubMed Central ID: PMC11230513
DOI: 10.21203/rs.3.rs-4546309/v1
PMID: 38978576
Database: MEDLINE