Academic Journal

A composite genome approach to identify phylogenetically informative data from next-generation sequencing.

Bibliographic Details
Title: A composite genome approach to identify phylogenetically informative data from next-generation sequencing.
Authors: Schwartz RS; The Biodesign Institute, Arizona State University, Tempe, AZ, USA. Rachel.Schwartz@asu.edu. Harkins KM; School of Human Evolution and Social Change, Arizona State University, Tempe, AZ, USA; Department of Anthropology, University of California - Santa Cruz, Santa Cruz, CA, USA. kmharkin@ucsc.edu. Stone AC; School of Human Evolution and Social Change, Arizona State University, Tempe, AZ, USA. acstone@asu.edu. Cartwright RA; The Biodesign Institute, Arizona State University, Tempe, AZ, USA; School of Life Sciences, Arizona State University, Tempe, AZ, USA. cartwright@asu.edu.
Source: BMC bioinformatics [BMC Bioinformatics] 2015 Jun 11; Vol. 16, pp. 193. Date of Electronic Publication: 2015 Jun 11.
Publication Type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.
Language: English
Journal Info: Publisher: BioMed Central; Country of Publication: England; NLM ID: 100965194; Publication Model: Electronic; Cited Medium: Internet; ISSN: 1471-2105 (Electronic); Linking ISSN: 14712105; NLM ISO Abbreviation: BMC Bioinformatics; Subsets: MEDLINE
Imprint Name(s): Original Publication: [London] : BioMed Central, 2000-
MeSH Terms: Phylogeny*; Software*; High-Throughput Nucleotide Sequencing/*methods; Hominidae/*genetics; Mammals/*genetics; Sequence Analysis, DNA/*methods; Animals; Genome; Genomics/methods
Abstract: Background: Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time-consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.
Results: In simulations, SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.
Conclusions: SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.
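Illustrative note: the abstract refers to "variable sites with phylogenetic signal" and to datasets with "different levels of missing data". The Python sketch below shows one common way such a filter can be expressed; it is not taken from SISRS and does not reproduce its implementation. The function names, the `N` missing-data symbol, the `max_missing` threshold, and the toy base calls are all assumptions made for illustration. It keeps a site only if at most `max_missing` taxa lack a call and the site is parsimony-informative (at least two bases, each seen in at least two taxa).

```python
from collections import Counter

MISSING = "N"  # assumed symbol for "no base call" at a site


def classify_site(calls, max_missing):
    """Classify one site given a dict of taxon -> base call (hypothetical helper)."""
    bases = [b for b in calls.values() if b != MISSING]
    if len(calls) - len(bases) > max_missing:
        return "skip"  # too many taxa without data
    counts = Counter(bases)
    if len(counts) < 2:
        return "invariant"
    # Parsimony-informative: two or more bases each present in two or more taxa.
    shared = sum(1 for c in counts.values() if c >= 2)
    return "informative" if shared >= 2 else "variable"


def informative_sites(alignment, max_missing):
    """Return the sorted IDs of sites that pass the missing-data and signal filter."""
    return sorted(
        site
        for site, calls in alignment.items()
        if classify_site(calls, max_missing) == "informative"
    )


# Toy data (invented, five ape genera as column labels only).
TAXA = ["Homo", "Pan", "Gorilla", "Pongo", "Hylobates"]
toy_alignment = {
    "site_a": dict(zip(TAXA, "AAGGG")),  # two bases, each shared
    "site_b": dict(zip(TAXA, "CCCCT")),  # variable, but T is a singleton
    "site_c": dict(zip(TAXA, "TTTTT")),  # invariant
    "site_d": dict(zip(TAXA, "AAGGN")),  # informative, one taxon missing
}

strict = informative_sites(toy_alignment, max_missing=0)
relaxed = informative_sites(toy_alignment, max_missing=1)
```

Raising `max_missing` admits more sites at the cost of sparser data per site, which is the kind of trade-off that producing datasets at several missing-data levels is meant to explore.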
Grant Information: R01 GM101352 United States GM NIGMS NIH HHS; R01-GM101352-01A1 United States GM NIGMS NIH HHS
Entry Date(s): Date Created: 20150612 Date Completed: 20150928 Latest Revision: 20200306
Update Code: 20221213
PubMed Central ID: PMC4464851
DOI: 10.1186/s12859-015-0632-y
PMID: 26062548
Database: MEDLINE