Academic Journal

Predicting protein model correctness in Coot using machine learning.

Bibliographic Details
Title: Predicting protein model correctness in Coot using machine learning.
Authors: Bond PS; Department of Chemistry, University of York, York YO10 5DD, United Kingdom., Wilson KS; Department of Chemistry, University of York, York YO10 5DD, United Kingdom., Cowtan KD; Department of Chemistry, University of York, York YO10 5DD, United Kingdom.
Source: Acta crystallographica. Section D, Structural biology [Acta Crystallogr D Struct Biol] 2020 Aug 01; Vol. 76 (Pt 8), pp. 713-723. Date of Electronic Publication: 2020 Jul 27.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: John Wiley & Sons Country of Publication: United States NLM ID: 101676043 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 2059-7983 (Electronic) Linking ISSN: 20597983 NLM ISO Abbreviation: Acta Crystallogr D Struct Biol Subsets: MEDLINE
Imprint Name(s): Publication: <2018-> : Medford, MA : John Wiley & Sons
Original Publication: [Malden, MA] : John Wiley & Sons, [2016]-
MeSH Terms: Machine Learning* , Models, Molecular* , Protein Conformation* , Software*, Proteins/*chemistry, Crystallography, X-Ray
Abstract: Manually identifying and correcting errors in protein models can be a slow process, but improvements in validation tools and automated model-building software can contribute to reducing this burden. This article presents a new correctness score that is produced by combining multiple sources of information using a neural network. The residues in 639 automatically built models were marked as correct or incorrect by comparing them with the coordinates deposited in the PDB. A number of features were also calculated for each residue using Coot, including map-to-model correlation, density values, B factors, clashes, Ramachandran scores, rotamer scores and resolution. Two neural networks were created using these features as inputs: one to predict the correctness of main-chain atoms and the other for side chains. The 639 structures were split into 511 that were used to train the neural networks and 128 that were used to test performance. The predicted correctness scores could correctly categorize 92.3% of the main-chain atoms and 87.6% of the side chains. A Coot ML Correctness script was written to display the scores in a graphical user interface as well as for the automatic pruning of chains, residues and side chains with low scores. The automatic pruning function was added to the CCP4i2 Buccaneer automated model-building pipeline, leading to significant improvements, especially for high-resolution structures.
(open access.)
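Illustrative sketch (not part of the record): the score-based pruning described in the abstract could look roughly like the following. The `ResidueScore` class, the `plan_pruning` function, the 0.5 threshold and the rule that a low main-chain score removes the whole residue are all assumptions made for illustration; this does not call Coot and is not the Coot ML Correctness script itself. Chain-level pruning is omitted.

```python
from dataclasses import dataclass


@dataclass
class ResidueScore:
    """Hypothetical container for the two per-residue scores."""
    chain: str
    number: int
    main_chain: float  # predicted correctness of the main-chain atoms (0 to 1)
    side_chain: float  # predicted correctness of the side chain (0 to 1)


def plan_pruning(scores, threshold=0.5):
    """Return (residues_to_delete, side_chains_to_trim).

    The threshold is an assumed cut-off, not a value taken from the article.
    """
    delete, trim = [], []
    for s in scores:
        key = (s.chain, s.number)
        if s.main_chain < threshold:
            delete.append(key)  # low main-chain score: drop the whole residue
        elif s.side_chain < threshold:
            trim.append(key)    # main chain kept, only the side chain removed
    return delete, trim


# Made-up example scores for three residues in chain A.
scores = [
    ResidueScore("A", 1, 0.95, 0.90),
    ResidueScore("A", 2, 0.20, 0.80),
    ResidueScore("A", 3, 0.85, 0.10),
]
delete, trim = plan_pruning(scores)
print("delete:", delete)
print("trim side chains:", trim)
```

Separating the decision (which residues to delete or trim) from the model edit itself keeps the sketch independent of any particular model-building API.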
Grant Information: BB/S005099/1 United Kingdom BB_ Biotechnology and Biological Sciences Research Council; BB/M011151/1 United Kingdom BB_ Biotechnology and Biological Sciences Research Council
Contributed Indexing: Keywords: Coot; machine learning; model building; software; structure solution; validation
Substance Nomenclature: 0 (Proteins)
Entry Date(s): Date Created: 20200804 Date Completed: 20210709 Latest Revision: 20231111
Update Code: 20240628
PubMed Central ID: PMC7397494
DOI: 10.1107/S2059798320009080
PMID: 32744253
Database: MEDLINE