Bibliographic Details
Title: |
A Knowledge Graph Approach to Elucidate the Role of Organellar Pathways in Disease via Biomedical Reports. |
Authors: |
Pelletier AR; Department of Physiology, UCLA School of Medicine; Scalable Analytics Institute (ScAi) at Department of Computer Science, UCLA School of Engineering; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; arpelletier@g.ucla.edu., Steinecke D; Department of Physiology, UCLA School of Medicine; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; Medical Informatics, University of California at Los Angeles (UCLA)., Sigdel D; Department of Physiology, UCLA School of Medicine., Adam I; Department of Physiology, UCLA School of Medicine., Caufield JH; Department of Physiology, UCLA School of Medicine., Guevara-Gonzalez V; Department of Physiology, UCLA School of Medicine., Ramirez J; Department of Physiology, UCLA School of Medicine., Verma A; Department of Physiology, UCLA School of Medicine., Bali K; Department of Physiology, UCLA School of Medicine., Downs K; Department of Physiology, UCLA School of Medicine., Wang W; Department of Physiology, UCLA School of Medicine; Scalable Analytics Institute (ScAi) at Department of Computer Science, UCLA School of Engineering; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA., Bui A; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; Medical Informatics, University of California at Los Angeles (UCLA)., Ping P; Department of Physiology, UCLA School of Medicine; Scalable Analytics Institute (ScAi) at Department of Computer Science, UCLA School of Engineering; NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program, UCLA; Medical Informatics, University of California at Los Angeles (UCLA); Department of Medicine (Cardiology), UCLA School of Medicine. |
Source: |
Journal of visualized experiments : JoVE [J Vis Exp] 2023 Oct 13 (200). Date of Electronic Publication: 2023 Oct 13. |
Publication Type: |
Journal Article; Video-Audio Media; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S. |
Language: |
English |
Journal Info: |
Publisher: MYJoVE Corporation Country of Publication: United States NLM ID: 101313252 Publication Model: Electronic Cited Medium: Internet ISSN: 1940-087X (Electronic) Linking ISSN: 1940087X NLM ISO Abbreviation: J Vis Exp Subsets: MEDLINE |
Imprint Name(s): |
Original Publication: [Boston, Mass. : MYJoVE Corporation, 2006]- |
MeSH Terms: |
Pattern Recognition, Automated*; Software*; Reproducibility of Results; Data Mining/methods |
Abstract: |
The rapidly increasing and vast quantities of biomedical reports, each containing numerous entities and rich information, represent a rich resource for biomedical text-mining applications. These tools enable investigators to integrate, conceptualize, and translate these discoveries to uncover new insights into disease pathology and therapeutics. In this protocol, we present CaseOLAP LIFT, a new computational pipeline to investigate cellular components and their disease associations by extracting user-selected information from text datasets (e.g., biomedical literature). The software identifies sub-cellular proteins and their functional partners within disease-relevant documents. Additional disease-relevant documents are identified via the software's label imputation method. To contextualize the resulting protein-disease associations and to integrate information from multiple relevant biomedical resources, a knowledge graph is automatically constructed for further analyses. We present one use case with a corpus of ~34 million text documents downloaded online to provide an example of elucidating the role of mitochondrial proteins in distinct cardiovascular disease phenotypes using this method. Furthermore, a deep learning model was applied to the resulting knowledge graph to predict previously unreported relationships between proteins and disease, resulting in 1,583 associations with predicted probabilities >0.90 and with an area under the receiver operating characteristic curve (AUROC) of 0.91 on the test set. This software features a highly customizable and automated workflow, with a broad scope of raw data available for analysis; therefore, using this method, protein-disease associations can be identified with enhanced reliability within a text corpus. |
Grant Information: |
R35 HL135772 United States HL NHLBI NIH HHS; T32 EB016640 United States EB NIBIB NIH HHS; R01 HL146739 United States HL NHLBI NIH HHS |
Entry Date(s): |
Date Created: 20231030 Date Completed: 20231031 Latest Revision: 20231101 |
Update Code: |
20231215 |
DOI: |
10.3791/65084 |
PMID: |
37902366 |
Database: |
MEDLINE |