Academic Journal

Improving Enzyme Optimum Temperature Prediction with Resampling Strategies and Ensemble Learning.

Bibliographic Details
Title: Improving Enzyme Optimum Temperature Prediction with Resampling Strategies and Ensemble Learning.
Authors: Gado JE (Department of Chemical and Materials Engineering, University of Kentucky, Lexington, Kentucky 40506, United States; National Bioenergy Center, National Renewable Energy Laboratory, Golden, Colorado 80401, United States); Beckham GT (National Bioenergy Center, National Renewable Energy Laboratory, Golden, Colorado 80401, United States); Payne CM (Department of Chemical and Materials Engineering, University of Kentucky, Lexington, Kentucky 40506, United States).
Source: Journal of chemical information and modeling [J Chem Inf Model] 2020 Aug 24; Vol. 60 (8), pp. 4098-4107. Date of Electronic Publication: 2020 Jul 22.
Publication Type: Journal Article; Research Support, U.S. Gov't, Non-P.H.S.
Language: English
Journal Info: Publisher: American Chemical Society; Country of Publication: United States; NLM ID: 101230060; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1549-960X (Electronic); Linking ISSN: 15499596; NLM ISO Abbreviation: J Chem Inf Model; Subsets: MEDLINE
Imprint Name(s): Original Publication: Washington, D.C. : American Chemical Society, c2005-
MeSH Terms: Machine Learning*, Temperature
Abstract: Accurate prediction of the optimal catalytic temperature (T_opt) of enzymes is vital in biotechnology, as enzymes with high T_opt values are desired for enhanced reaction rates. Recently, a machine learning method (temperature optima for microorganisms and enzymes, TOME) for predicting T_opt was developed. TOME was trained on a normally distributed data set with a median T_opt of 37 °C and less than 5% of T_opt values above 85 °C, limiting the method's predictive capabilities for thermostable enzymes. Due to the distribution of the training data, the mean squared error on T_opt values greater than 85 °C is nearly an order of magnitude higher than the error on values between 30 and 50 °C. In this study, we apply ensemble learning and resampling strategies that tackle the data imbalance to significantly decrease the error on high T_opt values (>85 °C) by 60% and increase the overall R^2 value from 0.527 to 0.632. The revised method, temperature optima for enzymes with resampling (TOMER), and the resampling strategies applied in this work are freely available to other researchers as Python packages on GitHub.
Entry Date(s): Date Created: 20200709; Date Completed: 20210617; Latest Revision: 20210617
Update Code: 20231215
DOI: 10.1021/acs.jcim.0c00489
PMID: 32639729
Database: MEDLINE