Academic Journal

Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data.

Bibliographic Details
Title: Gene Model Annotations for Drosophila melanogaster: Impact of High-Throughput Data.
Authors: Matthews BB; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138 bmatthew@morgan.harvard.edu., Dos Santos G; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Crosby MA; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Emmert DB; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., St Pierre SE; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Gramates LS; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Zhou P; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Schroeder AJ; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Falls K; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Strelets V; Department of Biology, Indiana University, Bloomington, Indiana 47405., Russo SM; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138., Gelbart WM; Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138.
Corporate Authors: FlyBase Consortium
Source: G3 (Bethesda, Md.) [G3 (Bethesda)] 2015 Jun 24; Vol. 5 (8), pp. 1721-36. Date of Electronic Publication: 2015 Jun 24.
Publication Type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Language: English
Journal Info: Publisher: Oxford University Press Country of Publication: England NLM ID: 101566598 Publication Model: Electronic Cited Medium: Internet ISSN: 2160-1836 (Electronic) Linking ISSN: 21601836 NLM ISO Abbreviation: G3 (Bethesda) Subsets: MEDLINE
Imprint Name(s): Publication: 2021- : [Oxford] : Oxford University Press
Original Publication: Bethesda, MD : Genetics Society of America, 2011-
MeSH Terms: Molecular Sequence Annotation*, Drosophila melanogaster/*genetics, 3' Untranslated Regions ; Animals ; Databases, Genetic ; Exons ; Female ; Male ; Models, Genetic ; RNA, Small Untranslated/chemistry ; RNA, Small Untranslated/metabolism ; Sequence Analysis, RNA ; Transcription Initiation Site ; Transcriptome
Abstract: We report the current status of the FlyBase annotated gene set for Drosophila melanogaster and highlight improvements based on high-throughput data. The FlyBase annotated gene set consists entirely of manually annotated gene models, with the exception of some classes of small non-coding RNAs. All gene models have been reviewed using evidence from high-throughput datasets, primarily from the modENCODE project. These datasets include RNA-Seq coverage data, RNA-Seq junction data, transcription start site profiles, and translation stop-codon read-through predictions. New annotation guidelines were developed to take into account the use of the high-throughput data. We describe how this flood of new data was incorporated into thousands of new and revised annotations. FlyBase has adopted a philosophy of excluding low-confidence and low-frequency data from gene model annotations; we also do not attempt to represent all possible permutations for complex and modularly organized genes. This has allowed us to produce a high-confidence, manageable gene annotation dataset that is available at FlyBase (http://flybase.org). Interesting aspects of new annotations include new genes (coding, non-coding, and antisense), many genes with alternative transcripts with very long 3' UTRs (up to 15-18 kb), and a stunning mismatch in the number of male-specific genes (approximately 13% of all annotated gene models) vs. female-specific genes (less than 1%). The number of identified pseudogenes and mutations in the sequenced strain also increased significantly. We discuss remaining challenges, for instance, identification of functional small polypeptides and detection of alternative translation starts.
(Copyright © 2015 Matthews et al.)
Grant Information: U41 HG00739 United States HG NHGRI NIH HHS; U41 HG000739 United States HG NHGRI NIH HHS; G1000968 United Kingdom MRC_ Medical Research Council; P41 HG000739 United States HG NHGRI NIH HHS
Contributed Indexing: Keywords: alternative splice; exon junction; lncRNA; transcription start site; transcriptome
Substance Nomenclature: 0 (3' Untranslated Regions)
0 (RNA, Small Untranslated)
Entry Date(s): Date Created: 20150626 Date Completed: 20160512 Latest Revision: 20220129
Update Code: 20240628
PubMed Central ID: PMC4528329
DOI: 10.1534/g3.115.018929
PMID: 26109357
Database: MEDLINE