Academic Journal

CRMnet: A deep learning model for predicting gene expression from large regulatory sequence datasets

Bibliographic Details
Title: CRMnet: A deep learning model for predicting gene expression from large regulatory sequence datasets
Authors: Ke Ding, Gunjan Dixit, Brian J. Parker, Jiayu Wen
Source: Frontiers in Big Data, Vol 6 (2023)
Publisher Information: Frontiers Media S.A., 2023.
Publication Year: 2023
Collection: LCC:Information technology
Subject Terms: deep learning, big data, gene expression, yeast, genomics, HPC, Information technology, T58.5-58.64
Description: Recent large datasets measuring the gene expression of millions of possible gene promoter sequences provide a resource to design and train optimized deep neural network architectures to predict expression from sequences. High predictive performance due to the modeling of dependencies within and between regulatory sequences is an enabler for biological discoveries in gene regulation through model interpretation techniques. To understand the regulatory code that delineates gene expression, we have designed a novel deep-learning model (CRMnet) to predict gene expression in Saccharomyces cerevisiae. Our model outperforms the current benchmark models and achieves a Pearson correlation coefficient of 0.971 and a mean squared error of 3.200. Interpretation of informative genomic regions determined from model saliency maps, and overlapping the saliency maps with known yeast motifs, supports that our model can successfully locate the binding sites of transcription factors that actively modulate gene expression. We compare our model's training times on a large compute cluster with GPUs and Google TPUs to indicate practical training times on similar datasets.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2624-909X
Relation: https://www.frontiersin.org/articles/10.3389/fdata.2023.1113402/full; https://doaj.org/toc/2624-909X
DOI: 10.3389/fdata.2023.1113402
Access URL: https://doaj.org/article/cace88912d0c4c69a50d87c4e313777a
Accession Number: edsdoj.88912d0c4c69a50d87c4e313777a
Database: Directory of Open Access Journals