Academic Journal

Exploring the Intersection of Artificial Intelligence and Neurosurgery: Let us be Cautious With ChatGPT.

Bibliographic Details
Title: Exploring the Intersection of Artificial Intelligence and Neurosurgery: Let us be Cautious With ChatGPT.
Authors: Mishra A; Department of Neurological Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Lake Success, New York, USA., Begley SL, Chen A, Rob M, Pelcher I, Ward M, Schulder M
Source: Neurosurgery [Neurosurgery] 2023 Dec 01; Vol. 93 (6), pp. 1366-1373. Date of Electronic Publication: 2023 Jul 07.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Lippincott Williams & Wilkins, Inc Country of Publication: United States NLM ID: 7802914 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1524-4040 (Electronic) Linking ISSN: 0148396X NLM ISO Abbreviation: Neurosurgery Subsets: MEDLINE
Imprint Name(s): Publication: 2022- : [Philadelphia] : Lippincott Williams & Wilkins, Inc.
Original Publication: Baltimore, Williams & Wilkins.
MeSH Terms: Neurosurgery*; Humans; Artificial Intelligence; Neurosurgical Procedures; Neurosurgeons; Algorithms
Abstract: Background and Objectives: ChatGPT is a novel natural language processing artificial intelligence (AI) module where users enter any question or command and receive a single text response within seconds. As AI becomes more accessible, patients may begin to use it as a resource for medical information and advice. This is the first study to assess the neurosurgical information that is provided by ChatGPT.
Methods: ChatGPT was accessed in January 2023, and prompts were created requesting treatment information for 40 common neurosurgical conditions. Quantitative characteristics were collected, and four independent reviewers evaluated the responses using the DISCERN tool. Prompts were compared against the American Association of Neurological Surgeons (AANS) "For Patients" webpages.
Results: ChatGPT returned text organized in paragraph and bullet-point lists. ChatGPT responses were shorter (mean 270.1 ± 41.9 words; AANS webpage 1634.5 ± 891.3 words) but more difficult to read (mean Flesch-Kincaid score 32.4 ± 6.7; AANS webpage 37.1 ± 7.0). ChatGPT output was found to be of "fair" quality (mean DISCERN score 44.2 ± 4.1) and significantly inferior to the "good" overall quality of the AANS patient website (57.7 ± 4.4). ChatGPT was poor in providing references/resources and describing treatment risks. ChatGPT provided 177 references, of which 68.9% were inaccurate and 33.9% were completely falsified.
Conclusion: ChatGPT is an adaptive resource for neurosurgical information but has shortcomings that limit the quality of its responses, including poor readability, lack of references, and failure to fully describe treatment options. Hence, patients and providers should remain wary of the provided content. As ChatGPT or other AI search algorithms continue to improve, they may become a reliable alternative for medical information.
(Copyright © Congress of Neurological Surgeons 2023. All rights reserved.)
Entry Date(s): Date Created: 20230707 Date Completed: 20231116 Latest Revision: 20240222
Update Code: 20240222
DOI: 10.1227/neu.0000000000002598
PMID: 37417886
Database: MEDLINE