Academic Journal
Vulture: cloud-enabled scalable mining of microbial reads in public scRNA-seq data.
Title: | Vulture: cloud-enabled scalable mining of microbial reads in public scRNA-seq data. |
---|---|
Authors: | Chen J; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China., Yin D; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China., Wong HYH; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China., Duan X; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China., Yu KHO; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China., Ho JWK; Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China.; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China. |
Source: | GigaScience [Gigascience] 2024 Jan 02; Vol. 13. |
Publication Type: | Journal Article; Research Support, Non-U.S. Gov't |
Language: | English |
Journal Info: | Publisher: Oxford University Press Country of Publication: United States NLM ID: 101596872 Publication Model: Print Cited Medium: Internet ISSN: 2047-217X (Electronic) Linking ISSN: 2047217X NLM ISO Abbreviation: Gigascience Subsets: MEDLINE |
Imprint Name(s): | Publication: 2017- : New York : Oxford University Press Original Publication: London : BioMed Central |
MeSH Terms: | Carcinoma, Hepatocellular*/genetics ; Liver Neoplasms* ; Humans ; Benchmarking ; DNA Copy Number Variations ; Hepatitis B virus ; Single-Cell Gene Expression Analysis |
Abstract: | The rapidly growing collection of public single-cell sequencing data has become a valuable resource for molecular, cellular, and microbial discovery. Previous studies mostly overlooked detecting pathogens in human single-cell sequencing data. Moreover, existing bioinformatics tools lack the scalability to deal with big public data. We introduce Vulture, a scalable cloud-based pipeline that performs microbial calling for single-cell RNA sequencing (scRNA-seq) data, enabling meta-analysis of host-microbial studies from the public domain. In our benchmarking experiments, Vulture is 66% to 88% faster than local tools (PathogenTrack and Venus) and 41% faster than the state-of-the-art cloud-based tool Cumulus, while achieving comparable microbial read identification. In terms of the cost on cloud computing systems, Vulture also shows a cost reduction of 83% ($12 vs. $70). We applied Vulture to 2 coronavirus disease 2019, 3 hepatocellular carcinoma (HCC), and 2 gastric cancer human patient cohorts with public sequencing reads data from scRNA-seq experiments and discovered cell type-specific enrichment of severe acute respiratory syndrome coronavirus 2, hepatitis B virus (HBV), and Helicobacter pylori-positive cells, respectively. In the HCC analysis, all cohorts showed hepatocyte-only enrichment of HBV, with cell subtype-associated HBV enrichment based on inferred copy number variations. In summary, Vulture presents a scalable and economical framework to mine unknown host-microbial interactions from large-scale public scRNA-seq data. Vulture is available via an open-source license at https://github.com/holab-hku/Vulture. (© The Author(s) 2024. Published by Oxford University Press GigaScience.) |
Grant Information: | Innovation and Technology Commission - Hong Kong |
Contributed Indexing: | Keywords: COVID-19; HCC; cloud computing; single cell; virus |
Entry Date(s): | Date Created: 20240109 Date Completed: 20240124 Latest Revision: 20240724 |
Update Code: | 20240726 |
PubMed Central ID: | PMC10776309 |
DOI: | 10.1093/gigascience/giad117 |
PMID: | 38195165 |
Database: | MEDLINE |
ISSN: | 2047-217X |