Academic Journal

A Knowledge Graph Approach to Elucidate the Role of Organellar Pathways in Disease via Biomedical Reports.

Bibliographic Details
Title: A Knowledge Graph Approach to Elucidate the Role of Organellar Pathways in Disease via Biomedical Reports.
Authors: Pelletier AR; Department of Physiology, UCLA School of Medicine; Scalable Analytics Institute (ScAi) at Department of Computer Science, UCLA School of Engineering; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; arpelletier@g.ucla.edu., Steinecke D; Department of Physiology, UCLA School of Medicine; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; Medical Informatics, University of California at Los Angeles (UCLA)., Sigdel D; Department of Physiology, UCLA School of Medicine., Adam I; Department of Physiology, UCLA School of Medicine., Caufield JH; Department of Physiology, UCLA School of Medicine., Guevara-Gonzalez V; Department of Physiology, UCLA School of Medicine., Ramirez J; Department of Physiology, UCLA School of Medicine., Verma A; Department of Physiology, UCLA School of Medicine., Bali K; Department of Physiology, UCLA School of Medicine., Downs K; Department of Physiology, UCLA School of Medicine., Wang W; Department of Physiology, UCLA School of Medicine; Scalable Analytics Institute (ScAi) at Department of Computer Science, UCLA School of Engineering; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA., Bui A; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; Medical Informatics, University of California at Los Angeles (UCLA)., Ping P; Department of Physiology, UCLA School of Medicine; Scalable Analytics Institute (ScAi) at Department of Computer Science, UCLA School of Engineering; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; Medical Informatics, University of California at Los Angeles (UCLA); Department of Medicine (Cardiology), UCLA School of Medicine.
Source: Journal of visualized experiments : JoVE [J Vis Exp] 2023 Oct 13 (200). Date of Electronic Publication: 2023 Oct 13.
Publication Type: Journal Article; Video-Audio Media; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.
Language: English
Journal Info: Publisher: MYJoVE Corporation Country of Publication: United States NLM ID: 101313252 Publication Model: Electronic Cited Medium: Internet ISSN: 1940-087X (Electronic) Linking ISSN: 1940087X NLM ISO Abbreviation: J Vis Exp Subsets: MEDLINE
Imprint Name(s): Original Publication: [Boston, Mass. : MYJoVE Corporation, 2006]-
MeSH Terms: Pattern Recognition, Automated*; Software*; Reproducibility of Results; Data Mining/methods
Abstract: The rapidly increasing and vast quantities of biomedical reports, each containing numerous entities and rich information, represent a rich resource for biomedical text-mining applications. These tools enable investigators to integrate, conceptualize, and translate these discoveries to uncover new insights into disease pathology and therapeutics. In this protocol, we present CaseOLAP LIFT, a new computational pipeline to investigate cellular components and their disease associations by extracting user-selected information from text datasets (e.g., biomedical literature). The software identifies sub-cellular proteins and their functional partners within disease-relevant documents. Additional disease-relevant documents are identified via the software's label imputation method. To contextualize the resulting protein-disease associations and to integrate information from multiple relevant biomedical resources, a knowledge graph is automatically constructed for further analyses. We present one use case with a corpus of ~34 million text documents downloaded online to provide an example of elucidating the role of mitochondrial proteins in distinct cardiovascular disease phenotypes using this method. Furthermore, a deep learning model was applied to the resulting knowledge graph to predict previously unreported relationships between proteins and disease, resulting in 1,583 associations with predicted probabilities >0.90 and with an area under the receiver operating characteristic curve (AUROC) of 0.91 on the test set. This software features a highly customizable and automated workflow, with a broad scope of raw data available for analysis; therefore, using this method, protein-disease associations can be identified with enhanced reliability within a text corpus.
Grant Information: R35 HL135772 United States HL NHLBI NIH HHS; T32 EB016640 United States EB NIBIB NIH HHS; R01 HL146739 United States HL NHLBI NIH HHS
Entry Date(s): Date Created: 20231030 Date Completed: 20231031 Latest Revision: 20231101
Update Code: 20231215
DOI: 10.3791/65084
PMID: 37902366
Database: MEDLINE