Academic Journal

Vulture: cloud-enabled scalable mining of microbial reads in public scRNA-seq data.

Bibliographic Details
Title: Vulture: cloud-enabled scalable mining of microbial reads in public scRNA-seq data.
Authors: Chen J; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China., Yin D; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China., Wong HYH; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China., Duan X; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China., Yu KHO; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China., Ho JWK; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
Source: GigaScience [Gigascience] 2024 Jan 02; Vol. 13.
Publication Type: Journal Article; Research Support, Non-U.S. Gov't
Language: English
Journal Info: Publisher: Oxford University Press Country of Publication: United States NLM ID: 101596872 Publication Model: Print Cited Medium: Internet ISSN: 2047-217X (Electronic) Linking ISSN: 2047217X NLM ISO Abbreviation: Gigascience Subsets: MEDLINE
Imprint Name(s): Publication: 2017- : New York : Oxford University Press
Original Publication: London : BioMed Central
MeSH Terms: Carcinoma, Hepatocellular*/genetics; Liver Neoplasms*; Humans; Benchmarking; DNA Copy Number Variations; Hepatitis B virus; Single-Cell Gene Expression Analysis
Abstract: The rapidly growing collection of public single-cell sequencing data has become a valuable resource for molecular, cellular, and microbial discovery. Previous studies mostly overlooked detecting pathogens in human single-cell sequencing data. Moreover, existing bioinformatics tools lack the scalability to deal with big public data. We introduce Vulture, a scalable cloud-based pipeline that performs microbial calling for single-cell RNA sequencing (scRNA-seq) data, enabling meta-analysis of host-microbial studies from the public domain. In our benchmarking experiments, Vulture is 66% to 88% faster than local tools (PathogenTrack and Venus) and 41% faster than the state-of-the-art cloud-based tool Cumulus, while achieving comparable microbial read identification. In terms of the cost on cloud computing systems, Vulture also shows a cost reduction of 83% ($12 vs. $70). We applied Vulture to 2 coronavirus disease 2019, 3 hepatocellular carcinoma (HCC), and 2 gastric cancer human patient cohorts with public sequencing reads data from scRNA-seq experiments and discovered cell type-specific enrichment of severe acute respiratory syndrome coronavirus 2, hepatitis B virus (HBV), and Helicobacter pylori-positive cells, respectively. In the HCC analysis, all cohorts showed hepatocyte-only enrichment of HBV, with cell subtype-associated HBV enrichment based on inferred copy number variations. In summary, Vulture presents a scalable and economical framework to mine unknown host-microbial interactions from large-scale public scRNA-seq data. Vulture is available via an open-source license at https://github.com/holab-hku/Vulture.
(© The Author(s) 2024. Published by Oxford University Press GigaScience.)
Grant Information: Innovation and Technology Commission - Hong Kong
Contributed Indexing: Keywords: COVID-19; HCC; cloud computing; single cell; virus
Entry Date(s): Date Created: 20240109 Date Completed: 20240124 Latest Revision: 20240724
Update Code: 20240726
PubMed Central ID: PMC10776309
DOI: 10.1093/gigascience/giad117
PMID: 38195165
Database: MEDLINE