Academic Journal

SDM6A: A Web-Based Integrative Machine-Learning Framework for Predicting 6mA Sites in the Rice Genome

Bibliographic Details
Title: SDM6A: A Web-Based Integrative Machine-Learning Framework for Predicting 6mA Sites in the Rice Genome
Authors: Shaherin Basith, Balachandran Manavalan, Tae Hwan Shin, Gwang Lee
Source: Molecular Therapy: Nucleic Acids, Vol 18, Pp 131-141 (2019)
Publication Information: Elsevier, 2019.
Publication Year: 2019
Collection: LCC:Therapeutics. Pharmacology
Subject Terms: Therapeutics. Pharmacology, RM1-950
Description: DNA N6-adenine methylation (6mA) is an epigenetic modification found in prokaryotes and eukaryotes. Identifying 6mA sites in the rice genome is important for rice epigenetics and breeding, but the non-random distribution and biological functions of these sites remain unclear. Several machine-learning tools can identify 6mA sites, but their limited prediction accuracy restricts their usability in epigenetic research. Here, we developed a novel computational predictor, the Sequence-based DNA N6-methyladenine predictor (SDM6A), a two-layer ensemble approach for identifying 6mA sites in the rice genome. Unlike existing methods, which are based on single models with basic features, SDM6A explores various features, of which five encoding methods were identified as appropriate for this problem. Subsequently, an optimal feature set was identified from the encodings, and corresponding models were developed individually using support vector machine and extremely randomized tree classifiers. First, all five single models were integrated via an ensemble approach to define the class for each classifier. Second, the two classifiers were integrated to generate a final prediction. SDM6A achieved robust performance on cross-validation and independent evaluation, with an average accuracy and Matthews correlation coefficient (MCC) of 88.2% and 0.764, respectively. The corresponding metrics were 4.7%–11.0% and 2.3%–5.5% higher than those of existing methods, respectively. A user-friendly, publicly accessible web server (http://thegleelab.org/SDM6A) was implemented to predict novel putative 6mA sites in the rice genome. Keywords: DNA N6-adenine methylation, rice genome, machine learning, support vector machine, extremely randomized tree
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2162-2531
Relation: http://www.sciencedirect.com/science/article/pii/S2162253119302240; https://doaj.org/toc/2162-2531
DOI: 10.1016/j.omtn.2019.08.011
Access URL: https://doaj.org/article/b976a4e8214541289e4ac6cc4a836269
Accession Number: edsdoj.b976a4e8214541289e4ac6cc4a836269
Database: Directory of Open Access Journals
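
Illustrative note: the abstract in this record describes a two-layer ensemble (five per-encoding models combined within each classifier family, then the two families combined) and reports the Matthews correlation coefficient. The sketch below shows one way such a scheme could look. It assumes simple probability averaging at both layers and a 0.5 decision threshold; the record does not state the actual combination rule, so these choices and all function names are hypothetical and are not the authors' implementation.

```python
from math import sqrt
from statistics import mean


def layer_one(probabilities):
    """Layer 1 (assumed rule): average the 6mA probabilities that one
    classifier family produced across its per-encoding models."""
    return mean(probabilities)


def two_layer_predict(svm_probs, ert_probs, threshold=0.5):
    """Layer 2 (assumed rule): average the two family-level scores and
    compare the result against a threshold.

    svm_probs / ert_probs: one probability per encoding, for a single
    candidate sequence. Returns (final_score, is_predicted_6mA).
    """
    svm_score = layer_one(svm_probs)
    ert_score = layer_one(ert_probs)
    final_score = (svm_score + ert_score) / 2
    return final_score, final_score >= threshold


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Returns 0.0 when the denominator is zero (a common convention)."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Made-up probabilities for one sequence, five encodings per family.
    score, is_6ma = two_layer_predict(
        [0.9, 0.8, 0.7, 0.6, 0.5],
        [0.4, 0.5, 0.6, 0.7, 0.8],
    )
    print(score, is_6ma)
```

Averaging is only one possible integration rule; majority voting or weighted combinations would fit the same two-layer structure.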