Academic Journal

Variable selection in microbiome compositional data analysis.

Bibliographic Details
Title: Variable selection in microbiome compositional data analysis.
Authors: Susin A (Mathematical Department, UPC-Barcelona Tech, 08028 Barcelona, Spain); Wang Y (Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC 3010, Australia); Lê Cao KA (Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC 3010, Australia); Calle ML (Biosciences Department, Faculty of Sciences and Technology, University of Vic-Central University of Catalonia, Carrer de la Laura, 13, 08500 Vic, Spain).
Source: NAR genomics and bioinformatics [NAR Genom Bioinform] 2020 May 13; Vol. 2 (2), pp. lqaa029. Date of Electronic Publication: 2020 May 13 (Print Publication: 2020).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Oxford University Press; Country of Publication: England; NLM ID: 101756213; Publication Model: eCollection; Cited Medium: Internet; ISSN: 2631-9268 (Electronic); Linking ISSN: 26319268; NLM ISO Abbreviation: NAR Genom Bioinform; Subsets: PubMed not MEDLINE
Imprint Name(s): Original Publication: [Oxford] : Oxford University Press, [2019]-
Abstract: Though variable selection is one of the most relevant tasks in microbiome analysis, e.g. for the identification of microbial signatures, many studies still rely on methods that ignore the compositional nature of microbiome data. The applicability of compositional data analysis methods has been hampered by the availability of software and the difficulty in interpreting their results. This work is focused on three methods for variable selection that acknowledge the compositional structure of microbiome data: selbal, a forward selection approach for the identification of compositional balances, and clr-lasso and coda-lasso, two penalized regression models for compositional data analysis. This study highlights the link between these methods and brings out some limitations of the centered log-ratio transformation for variable selection. In particular, the fact that it is not subcompositionally consistent makes the microbial signatures obtained from clr-lasso not readily transferable. Coda-lasso is computationally efficient and suitable when the focus is the identification of the most associated microbial taxa. Selbal stands out when the goal is to obtain a parsimonious model with optimal prediction performance, but it is computationally greedy. We provide a reproducible vignette for the application of these methods that will enable researchers to fully leverage their potential in microbiome studies.
(© The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.)
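Illustrative note: the abstract's remark that the centered log-ratio (clr) transformation is not subcompositionally consistent can be seen in a small sketch. This example is not taken from the article or its vignette; the function name `clr` and the four abundance values are hypothetical and chosen only for illustration. It computes the clr of one sample over four taxa, then over the subcomposition with the last taxon dropped, and compares the result with a pairwise log-ratio.

```python
import math


def clr(composition):
    """Centered log-ratio: log of each part minus the mean log of all parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical abundances for four taxa in one sample (illustrative values only).
full = [10.0, 20.0, 30.0, 40.0]
# Subcomposition: the same sample with the fourth taxon dropped.
sub = full[:3]

clr_full = clr(full)
clr_sub = clr(sub)

# The clr value of the first taxon depends on which taxa are included.
print("clr of taxon 1, full vs. sub:", clr_full[0], clr_sub[0])

# A pairwise log-ratio between taxa 1 and 2 is unaffected by dropping taxon 4.
ratio_full = math.log(full[0] / full[1])
ratio_sub = math.log(sub[0] / sub[1])
print("log-ratio taxon 1 / taxon 2, full vs. sub:", ratio_full, ratio_sub)
```

Because each clr coordinate is defined relative to the geometric mean of all included parts, a coefficient attached to a clr-transformed taxon refers to a reference that changes with the set of taxa measured, whereas log-ratios between specific taxa do not.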
Entry Dates: Date Created: 20210212 Latest Revision: 20240330
Update Code: 20240330
PubMed Central ID: PMC7671404
DOI: 10.1093/nargab/lqaa029
PMID: 33575585
Database: MEDLINE