Academic Journal

Effusion: prediction of protein function from sequence similarity networks.

Bibliographic Details
Title: Effusion: prediction of protein function from sequence similarity networks.
Authors: Yunes JM; UC Berkeley - UCSF Graduate Program in Bioengineering, University of California, San Francisco, CA, USA.; Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA., Babbitt PC; Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA.; Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA.; Quantitative Biosciences Institute, University of California, San Francisco, CA, USA.
Source: Bioinformatics (Oxford, England) [Bioinformatics] 2019 Feb 01; Vol. 35 (3), pp. 442-451.
Publication Type: Journal Article; Research Support, N.I.H., Extramural
Language: English
Journal Info: Publisher: Oxford University Press Country of Publication: England NLM ID: 9808944 Publication Model: Print Cited Medium: Internet ISSN: 1367-4811 (Electronic) Linking ISSN: 1367-4803 NLM ISO Abbreviation: Bioinformatics Subsets: MEDLINE
Imprint Name(s): Original Publication: Oxford : Oxford University Press, c1998-
MeSH Terms: Computational Biology*, Software*, Proteins/*chemistry, Gene Ontology
Abstract: Motivation: Critical evaluation of methods for protein function prediction shows that data integration improves the performance of methods that predict protein function, but a basic BLAST-based method is still a top contender. We sought to engineer a method that modernizes the classical approach while avoiding pitfalls common to state-of-the-art methods.
Results: We present a method for predicting protein function, Effusion, which uses a sequence similarity network to add context for homology transfer, a probabilistic model to account for the uncertainty in labels and function propagation, and the structure of the Gene Ontology (GO) to best utilize sparse input labels and make consistent output predictions. Effusion's model makes it practical to integrate rare experimental data and abundant primary sequence and sequence similarity. We demonstrate Effusion's performance using a critical evaluation method and provide an in-depth analysis. We also dissect the design decisions we used to address challenges for predicting protein function. Finally, we propose directions in which the framework of the method can be modified for additional predictive power.
Availability and Implementation: The source code for an implementation of Effusion is freely available at https://github.com/babbittlab/effusion.
Supplementary Information: Supplementary data are available at Bioinformatics online.
Grant Information: R01 GM060595 United States GM NIGMS NIH HHS
Substance Nomenclature: 0 (Proteins)
Entry Date(s): Date Created: 20180808 Date Completed: 20191104 Latest Revision: 20240714
Update Code: 20240714
PubMed Central ID: PMC6361244
DOI: 10.1093/bioinformatics/bty672
PMID: 30084920
Database: MEDLINE