Academic Journal

Large-scale identification of undiagnosed hepatic steatosis using natural language processing

Bibliographic Details
Title: Large-scale identification of undiagnosed hepatic steatosis using natural language processing
Authors: Carolin V. Schneider, Tang Li, David Zhang, Anya I. Mezina, Puru Rattan, Helen Huang, Kate Townsend Creasy, Eleonora Scorletti, Inuk Zandvakili, Marijana Vujkovic, Leonida Hehl, Jacob Fiksel, Joseph Park, Kirk Wangensteen, Marjorie Risman, Kyong-Mi Chang, Marina Serper, Rotonya M. Carr, Kai Markus Schneider, Jinbo Chen, Daniel J. Rader
Source: EClinicalMedicine, Vol 62, Iss , Pp 102149- (2023)
Publication Information: Elsevier, 2023.
Publication Year: 2023
Collection: LCC:Medicine (General)
Subject Terms: Liver disease, NAFLD, Biopsy, EHR, Natural language processing, Medicine (General), R5-920
Description: Summary: Background: Nonalcoholic fatty liver disease (NAFLD) is a major cause of liver-related morbidity in people with and without diabetes, but it is underdiagnosed, posing challenges for research and clinical management. Here, we determine whether natural language processing (NLP) of data in the electronic health record (EHR) could identify undiagnosed patients with hepatic steatosis based on pathology and radiology reports. Methods: A rule-based NLP algorithm was built using a Linguamatics literature text mining tool to search 2.15 million pathology reports and 2.7 million imaging reports in the Penn Medicine EHR from November 2014 through December 2020 for evidence of hepatic steatosis. For quality control, two independent physicians manually reviewed randomly chosen biopsy and imaging reports (n = 353, PPV 99.7%). Findings: After exclusion of individuals with other causes of hepatic steatosis, 3007 patients with biopsy-proven NAFLD and 42,083 patients with imaging-proven NAFLD were identified. Interestingly, elevated ALT was not a sensitive predictor of the presence of steatosis, and only half of the biopsied patients with steatosis ever received an ICD diagnosis code for the presence of NAFLD/NASH. There was a robust association between the PNPLA3 and TM6SF2 risk alleles and steatosis identified by NLP. We identified 234 disorders that were significantly over- or underrepresented in all subjects with steatosis and identified changes in serum markers (e.g., GGT) associated with the presence of steatosis. Interpretation: This study demonstrates clear feasibility of NLP-based approaches to identify patients whose steatosis was indicated in imaging and pathology reports within a large healthcare system and uncovers undercoding of NAFLD in the general population. Identification of patients at risk could link them to improved care and outcomes.
Funding: The study was funded by US and German funding sources that provided financial support only and had no influence or control over the research process.
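Illustrative note: the abstract describes a rule-based search of report text followed by manual review to estimate positive predictive value (PPV). The following is a minimal Python sketch of that general idea, not the study's method. The study used a Linguamatics tool whose queries are not given in this record, so the patterns, the negation cues, and the function names `mentions_steatosis` and `ppv` are all hypothetical.

```python
import re

# Hypothetical keyword patterns; the study's actual queries are not given in this record.
STEATOSIS = re.compile(
    r"\b(hepatic steatosis|steatohepatitis|fatty (?:liver|infiltration))\b", re.I
)
# Hypothetical negation cues that appear earlier in the same sentence.
NEGATION = re.compile(r"\b(no|without|negative for|absence of)\b[^.;]*$", re.I)


def mentions_steatosis(report: str) -> bool:
    """Return True if some sentence mentions steatosis with no negation cue before it."""
    for sentence in re.split(r"[.;]\s*", report):
        match = STEATOSIS.search(sentence)
        if match and not NEGATION.search(sentence[: match.start()]):
            return True
    return False


def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: the share of flagged reports confirmed on manual review."""
    return true_positives / (true_positives + false_positives)
```

In this sketch, a report such as "Liver shows moderate hepatic steatosis." is flagged, while "No evidence of hepatic steatosis." is not. A production system would need far more careful handling of negation, uncertainty, and report sections than these few cues provide.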
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2589-5370
Relation: http://www.sciencedirect.com/science/article/pii/S2589537023003267; https://doaj.org/toc/2589-5370
DOI: 10.1016/j.eclinm.2023.102149
Access URL: https://doaj.org/article/54d545ab8c4f481f8d80186030f7ad3b
Accession Number: edsdoj.54d545ab8c4f481f8d80186030f7ad3b
Database: Directory of Open Access Journals