Academic Journal

APPRAISE-AI Tool for Quantitative Evaluation of AI Studies for Clinical Decision Support.

Bibliographic Details
Title: APPRAISE-AI Tool for Quantitative Evaluation of AI Studies for Clinical Decision Support.
Authors: Kwong JCC; Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada.; Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada., Khondker A; Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada., Lajkosz K; Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada.; Department of Biostatistics, University Health Network, University of Toronto, Toronto, Ontario, Canada., McDermott MBA; Department of Biomedical Informatics, Massachusetts Institute of Technology, Cambridge., Frigola XB; Laboratory for Computational Physiology, Harvard-Massachusetts Institute of Technology Division of Health Sciences and Technology, Cambridge.; Anesthesiology and Critical Care Department, Hospital Clinic de Barcelona, Barcelona, Spain., McCradden MD; Department of Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada.; Genetics & Genome Biology Research Program, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada.; Division of Clinical and Public Health, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada., Mamdani M; Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada.; Data Science and Advanced Analytics, Unity Health Toronto, Toronto, Ontario, Canada., Kulkarni GS; Division of Urology, Department of Surgery, University of Toronto, Toronto, Ontario, Canada.; Princess Margaret Cancer Centre, University Health Network, University of Toronto, Toronto, Ontario, Canada., Johnson AEW; Temerty Centre for AI Research and Education in Medicine, University of Toronto, Toronto, Ontario, Canada.; Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.; Child Health Evaluative Sciences, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada.
Source: JAMA network open [JAMA Netw Open] 2023 Sep 05; Vol. 6 (9), pp. e2335377. Date of Electronic Publication: 2023 Sep 05.
Publication Type: Systematic Review; Journal Article; Research Support, Non-U.S. Gov't
Language: English
Journal Info: Publisher: American Medical Association Country of Publication: United States NLM ID: 101729235 Publication Model: Electronic Cited Medium: Internet ISSN: 2574-3805 (Electronic) Linking ISSN: 25743805 NLM ISO Abbreviation: JAMA Netw Open Subsets: MEDLINE
Imprint Name(s): Original Publication: Chicago, IL : American Medical Association, [2018]-
MeSH Terms: Artificial Intelligence*; Decision Support Systems, Clinical*; Humans; Reproducibility of Results; Machine Learning; Clinical Relevance
Abstract: Importance: Artificial intelligence (AI) has gained considerable attention in health care, yet concerns have been raised around appropriate methods and fairness. Current AI reporting guidelines do not provide a means of quantifying overall quality of AI research, limiting their ability to compare models addressing the same clinical question.
Objective: To develop a tool (APPRAISE-AI) to evaluate the methodological and reporting quality of AI prediction models for clinical decision support.
Design, Setting, and Participants: This quality improvement study evaluated AI studies in the model development, silent, and clinical trial phases using the APPRAISE-AI tool, a quantitative method for evaluating quality of AI studies across 6 domains: clinical relevance, data quality, methodological conduct, robustness of results, reporting quality, and reproducibility. These domains included 24 items with a maximum overall score of 100 points. Points were assigned to each item, with higher points indicating stronger methodological or reporting quality. The tool was applied to a systematic review on machine learning to estimate sepsis that included articles published until September 13, 2019. Data analysis was performed from September to December 2022.
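The item-to-domain-to-overall aggregation described above can be sketched roughly as follows. This is a minimal illustration only: the item names (`item_a` … `item_f`), the per-item maxima, and the `score_study` helper are hypothetical placeholders, not the published APPRAISE-AI rubric, which has 24 items summing to 100 points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    domain: str      # one of the six domains named in the abstract
    name: str        # placeholder item label (hypothetical)
    max_points: int  # placeholder maximum (hypothetical)


# Toy rubric for illustration: NOT the published APPRAISE-AI items or weights.
TOY_RUBRIC = [
    Item("clinical relevance", "item_a", 4),
    Item("data quality", "item_b", 6),
    Item("methodological conduct", "item_c", 5),
    Item("robustness of results", "item_d", 5),
    Item("reporting quality", "item_e", 3),
    Item("reproducibility", "item_f", 2),
]


def score_study(awarded, rubric):
    """Return (per-domain totals, overall total) for one appraised study."""
    unknown = set(awarded) - {item.name for item in rubric}
    if unknown:
        raise ValueError(f"unknown items: {sorted(unknown)}")
    domain_totals = {}
    for item in rubric:
        points = awarded.get(item.name, 0)  # items not scored count as 0
        if not 0 <= points <= item.max_points:
            raise ValueError(f"{item.name}: {points} outside 0..{item.max_points}")
        domain_totals[item.domain] = domain_totals.get(item.domain, 0) + points
    return domain_totals, sum(domain_totals.values())


domains, overall = score_study({"item_a": 3, "item_b": 4, "item_f": 2}, TOY_RUBRIC)
print(domains, overall)
```

Higher totals indicate stronger methodological or reporting quality, as in the tool itself; only the structure of the calculation is meant to carry over.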
Main Outcomes and Measures: The primary outcomes were interrater and intrarater reliability and the correlation between APPRAISE-AI scores and expert scores, 3-year citation rate, number of Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) low risk-of-bias domains, and overall adherence to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement.
Results: A total of 28 studies were included. Overall APPRAISE-AI scores ranged from 33 (low quality) to 67 (high quality). Most studies were moderate quality. The 5 lowest scoring items included source of data, sample size calculation, bias assessment, error analysis, and transparency. Overall APPRAISE-AI scores were associated with expert scores (Spearman ρ, 0.82; 95% CI, 0.64-0.91; P < .001), 3-year citation rate (Spearman ρ, 0.69; 95% CI, 0.43-0.85; P < .001), number of QUADAS-2 low risk-of-bias domains (Spearman ρ, 0.56; 95% CI, 0.24-0.77; P = .002), and adherence to the TRIPOD statement (Spearman ρ, 0.87; 95% CI, 0.73-0.94; P < .001). Intraclass correlation coefficient ranges for interrater and intrarater reliability were 0.74 to 1.00 for individual items, 0.81 to 0.99 for individual domains, and 0.91 to 0.98 for overall scores.
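The associations above are reported as Spearman rank correlations. As a rough sketch of how such a coefficient is computed, the example below ranks two score lists (averaging ranks for ties) and takes the Pearson correlation of the ranks. The function names and the sample numbers are made up for illustration and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for five hypothetical studies (not data from the paper).
tool_scores = [33, 41, 50, 58, 67]
expert_scores = [2, 3, 3, 5, 4]
print(round(spearman_rho(tool_scores, expert_scores), 3))
```

In practice a statistics library would normally be used, which would also supply the confidence intervals and P values quoted in the abstract; this sketch covers only the point estimate.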
Conclusions and Relevance: In this quality improvement study, APPRAISE-AI demonstrated strong interrater and intrarater reliability and correlated well with several study quality measures. This tool may provide a quantitative approach for investigators, reviewers, editors, and funding organizations to compare the research quality across AI studies for clinical decision support.
Entry Date(s): Date Created: 20230925 Date Completed: 20230926 Latest Revision: 20240724
Update Code: 20240725
PubMed Central ID: PMC10520738
DOI: 10.1001/jamanetworkopen.2023.35377
PMID: 37747733
Database: MEDLINE