Academic Journal

Combining structural modeling with ensemble machine learning to accurately predict protein fold stability and binding affinity effects upon mutation.

Bibliographic Details
Title: Combining structural modeling with ensemble machine learning to accurately predict protein fold stability and binding affinity effects upon mutation.
Authors: Niklas Berliner, Joan Teyra, Recep Colak, Sebastian Garcia Lopez, Philip M Kim
Source: PLoS ONE, Vol 9, Iss 9, p e107353 (2014)
Publisher Information: Public Library of Science (PLoS), 2014.
Publication Year: 2014
Collection: LCC:Medicine
LCC:Science
Subject Terms: Medicine, Science
Description: Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both, domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1932-6203
Relation: http://europepmc.org/articles/PMC4170975?pdf=render; https://doaj.org/toc/1932-6203
DOI: 10.1371/journal.pone.0107353
Access URL: https://doaj.org/article/6e6fb2cb5bf94ec58f35a250b9f2a400
Accession Number: edsdoj.6e6fb2cb5bf94ec58f35a250b9f2a400
Database: Directory of Open Access Journals
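
Illustrative note: the abstract describes feeding energy terms and conservation scores into a Stochastic Gradient Boosting of Decision Trees (SGB-DT) regressor and scoring it by correlation coefficient. The sketch below shows the general idea only and is not the ELASPIC implementation. It assumes squared-error loss, replaces full decision trees with one-split stumps for brevity, and uses purely synthetic data. The feature names (`energy_term`, `conservation`) and all parameter values are hypothetical.

```python
import random


def fit_stump(X, residuals):
    """Fit a one-split regression tree (stump) minimising squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighbouring values
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no split possible: fall back to a constant prediction
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_mean, right_mean = stump
    return left_mean if row[j] <= t else right_mean


def fit_sgb(X, y, n_trees=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting: each round fits a stump to the
    residuals of a random subsample drawn without replacement."""
    rng = random.Random(seed)
    base = sum(y) / len(y)
    preds = [base] * len(y)
    k = min(len(y), max(2, int(subsample * len(y))))
    stumps = []
    for _ in range(n_trees):
        idx = rng.sample(range(len(y)), k)
        stump = fit_stump([X[i] for i in idx], [y[i] - preds[i] for i in idx])
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Synthetic stand-in data: columns are [energy_term, conservation] (hypothetical).
data_rng = random.Random(42)
X = [[data_rng.uniform(-2, 2), data_rng.uniform(0, 1)] for _ in range(60)]
y = [1.5 * e - c + data_rng.gauss(0, 0.2) for e, c in X]

model = fit_sgb(X, y)
r = pearson([predict(model, row) for row in X], y)
print(f"training-set Pearson r = {r:.2f}")
```

The printed value is a training-set correlation on toy data, so it says nothing about the 0.77 and 0.75 figures reported in the abstract, which come from the authors' own evaluation.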