Machine learning, transcriptome, and genotyping chip analyses provide insights into SNP markers identifying flower color in Platycodon grandiflorus

Bibliographic details
Title: Machine learning, transcriptome, and genotyping chip analyses provide insights into SNP markers identifying flower color in Platycodon grandiflorus
Authors: Chuloh Cho, Younhee Shin, Go-Eun Yu, Sang-Ho Kang, Si-Myung Lee, Chang-Kug Kim, Sathiyamoorthy Subramaniyam, Seung-Sik Lee
Source: Scientific Reports, Vol 11, Iss 1, Pp 1-9 (2021)
Scientific Reports
Publication details: Nature Portfolio, 2021.
Publication year: 2021
Subject terms: 0106 biological sciences, 0301 basic medicine, Platycodon, Genotype, Science, Feature selection, Single-nucleotide polymorphism, Platycodon grandiflorus, Biology, Machine learning, 01 natural sciences, Polymorphism, Single Nucleotide, Article, 03 medical and health sciences, SNP, Genotyping, Selection (genetic algorithm), Sanger sequencing, Multidisciplinary, Random forest, Computational biology and bioinformatics, 030104 developmental biology, Medicine, Artificial intelligence, Transcriptome, Plant sciences, 010606 plant biology & botany
Description: Bellflower is an edible ornamental garden plant in Asia. To predict flower color in bellflower plants, a transcriptome-wide approach based on machine learning, transcriptome, and genotyping chip analyses was used to identify SNP markers. Six machine learning methods were used to explore the classification potential of the selected SNPs as features in two datasets: training (60 RNA-Seq samples) and validation (480 Fluidigm chip samples). SNP selection was performed sequentially. First, 96 SNPs were selected from the transcriptome-wide SNPs using principal component analysis (PCA). Then, 9 of the 96 SNPs were identified using random forest based feature selection on the Fluidigm chip dataset. Among the six models, the random forest (RF) model produced higher classification performance than the others. The 9 SNP marker candidates selected for flower color classification were verified using genomic DNA PCR with Sanger sequencing. Our results suggest that this methodology could be used for future selection of breeding traits even when the plant accessions are highly heterogeneous.
Language: English
ISSN: 2045-2322
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::86286b501f62b153927929e50de26720
https://doaj.org/article/315537c8b84243b4b178f8d17ea3d7d7
Rights: OPEN
Accession number: edsair.doi.dedup.....86286b501f62b153927929e50de26720
Database: OpenAIRE