Academic Journal

A benchmark of batch-effect correction methods for single-cell RNA sequencing data.

Bibliographic Details
Title: A benchmark of batch-effect correction methods for single-cell RNA sequencing data.
Authors: Tran HTN, Ang KS, Chevrier M, Zhang X, Lee NYS, Goh M, Chen J (chen_jinmiao@immunol.a-star.edu.sg)
Affiliation (all authors): Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), 8A Biomedical Grove, Immunos Building, Level 3, Singapore, 138648, Singapore.
Source: Genome biology [Genome Biol] 2020 Jan 16; Vol. 21 (1), pp. 12. Date of Electronic Publication: 2020 Jan 16.
Publication Type: Journal Article; Research Support, Non-U.S. Gov't
Language: English
Journal Info: Publisher: BioMed Central Ltd Country of Publication: England NLM ID: 100960660 Publication Model: Electronic Cited Medium: Internet ISSN: 1474-760X (Electronic) Linking ISSN: 14747596 NLM ISO Abbreviation: Genome Biol Subsets: MEDLINE
Imprint Name(s): Publication: London, UK : BioMed Central Ltd
Original Publication: London : Genome Biology Ltd., c2000-
MeSH Terms: RNA-Seq/*methods; Single-Cell Analysis/*methods; Algorithms; Animals; Benchmarking; Big Data; Humans; Mice
Abstract: Background: Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial. Here, we perform an in-depth benchmark study on available batch correction methods to determine the most suitable method for batch-effect removal.
Results: We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics including kBET, LISI, ASW, and ARI. We also investigate the use of batch-corrected data to study differential gene expression.
Conclusion: Based on our results, Harmony, LIGER, and Seurat 3 are the recommended methods for batch integration. Due to its significantly shorter runtime, Harmony is recommended as the first method to try, with the other methods as viable alternatives.
Contributed Indexing: Keywords: Batch correction; Batch effect; Differential gene expression; Integration; Single-cell RNA-seq
Entry Date(s): Date Created: 20200118 Date Completed: 20200609 Latest Revision: 20240327
Update Code: 20240327
PubMed Central ID: PMC6964114
DOI: 10.1186/s13059-019-1850-9
PMID: 31948481
Database: MEDLINE