Academic Journal

Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

Bibliographic Details
Title: Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.
Authors: Bokulich NA; The Pathogen and Microbiome Institute, Northern Arizona University, PO Box 4073, Flagstaff, AZ, 86011-4073, USA. nicholas.bokulich@nau.edu., Kaehler BD; Research School of Biology, Australian National University, 46 Sullivans Creek Road, Acton ACT, 2601, Australia. benjamin.kaehler@anu.edu.au., Rideout JR; The Pathogen and Microbiome Institute, Northern Arizona University, PO Box 4073, Flagstaff, AZ, 86011-4073, USA., Dillon M; The Pathogen and Microbiome Institute, Northern Arizona University, PO Box 4073, Flagstaff, AZ, 86011-4073, USA., Bolyen E; The Pathogen and Microbiome Institute, Northern Arizona University, PO Box 4073, Flagstaff, AZ, 86011-4073, USA., Knight R; Departments of Pediatrics and Computer Science and Engineering, and Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA., Huttley GA; Research School of Biology, Australian National University, 46 Sullivans Creek Road, Acton ACT, 2601, Australia. gavin.huttley@anu.edu.au., Gregory Caporaso J; The Pathogen and Microbiome Institute, Northern Arizona University, PO Box 4073, Flagstaff, AZ, 86011-4073, USA. gregcaporaso@gmail.com.; Department of Biological Sciences, Northern Arizona University, 1298 S Knoles Drive, Building 56, 3rd Floor, Flagstaff, AZ, USA. gregcaporaso@gmail.com.
Source: Microbiome [Microbiome] 2018 May 17; Vol. 6 (1), pp. 90. Date of Electronic Publication: 2018 May 17.
Publication Type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.
Language: English
Journal Info: Publisher: BioMed Central Country of Publication: England NLM ID: 101615147 Publication Model: Electronic Cited Medium: Internet ISSN: 2049-2618 (Electronic) Linking ISSN: 20492618 NLM ISO Abbreviation: Microbiome Subsets: MEDLINE
Imprint Name(s): Original Publication: London: BioMed Central, 2013-
MeSH Terms: Computer Simulation*; Bacteria/*genetics; DNA, Intergenic/*genetics; Fungi/*genetics; Microbiota/*genetics; RNA, Ribosomal, 16S/*genetics; Sequence Alignment/*methods; Algorithms; Base Sequence/genetics; Machine Learning; Software
Abstract: Background: Taxonomic classification of marker-gene sequences is an important step in microbiome analysis.
Results: We present q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit (https://github.com/caporaso-lab/tax-credit-data).
Conclusions: Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
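Illustrative note: the abstract refers to alignment-based taxonomy consensus methods and to the effect of parameter tuning. The sketch below is one plausible way such a consensus step could work, not the plugin's actual implementation. The function name `consensus_taxonomy`, the `min_consensus` parameter and its 0.51 value, the semicolon-delimited lineage strings, and the example hits are all assumptions made for illustration.

```python
from collections import Counter


def consensus_taxonomy(hit_taxonomies, min_consensus=0.51):
    """Return the deepest lineage shared by at least `min_consensus` of the hits.

    hit_taxonomies: semicolon-delimited lineage strings, one per alignment hit
    (an assumed input format for this sketch).
    """
    if not hit_taxonomies:
        return "Unassigned"
    # Split each lineage into a list of rank labels.
    lineages = [[rank.strip() for rank in t.split(";")] for t in hit_taxonomies]
    total = len(lineages)
    consensus = []
    depth = 0
    while True:
        # Only hits that agree with the consensus so far, and that extend
        # to the current depth, can vote at this rank.
        candidates = [tuple(l[:depth + 1]) for l in lineages
                      if len(l) > depth and l[:depth] == consensus]
        if not candidates:
            break
        label, count = Counter(candidates).most_common(1)[0]
        # Stop descending once agreement falls below the threshold.
        if count / total < min_consensus:
            break
        consensus = list(label)
        depth += 1
    return "; ".join(consensus) if consensus else "Unassigned"


# Hypothetical top hits for one query sequence.
hits = [
    "k__Bacteria; p__Firmicutes; c__Bacilli; g__Lactobacillus",
    "k__Bacteria; p__Firmicutes; c__Bacilli; g__Streptococcus",
    "k__Bacteria; p__Firmicutes; c__Clostridia; g__Clostridium",
]
print(consensus_taxonomy(hits))
print(consensus_taxonomy(hits, min_consensus=1.0))
```

Raising the threshold in this sketch trades depth for agreement: the stricter call stops at a shallower rank than the default one, which is the kind of trade-off that parameter tuning has to balance.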
Grant Information: U54 CA143924 United States CA NCI NIH HHS; U54 CA143925 United States CA NCI NIH HHS
Substance Nomenclature: 0 (DNA, Intergenic)
0 (RNA, Ribosomal, 16S)
Entry Date(s): Date Created: 20180519 Date Completed: 20190111 Latest Revision: 20221017
Update Code: 20221213
PubMed Central ID: PMC5956843
DOI: 10.1186/s40168-018-0470-z
PMID: 29773078
Database: MEDLINE