CRMnet: a deep learning model for predicting gene expression from large regulatory sequence datasets

Bibliographic Details
Title: CRMnet: a deep learning model for predicting gene expression from large regulatory sequence datasets
Authors: Ke Ding, Gunjan Dixit, Brian J. Parker, Jiayu Wen
Publication Information: Cold Spring Harbor Laboratory, 2022.
Publication Year: 2022
Subject Terms: Artificial Intelligence, Computer Science (miscellaneous), Information Systems
Description: Recent large datasets measuring the gene expression of millions of possible gene promoter sequences provide a resource to design and train optimised deep neural network architectures to predict expression from sequences. High predictive performance due to the modelling of dependencies within and between regulatory sequences is an enabler for biological discoveries in gene regulation through model interpretation techniques. To understand the regulatory code that delineates gene expression, we have designed a novel deep-learning model (CRMnet) to predict gene expression in Saccharomyces cerevisiae. Our model outperforms the current benchmark models and achieves a Pearson correlation coefficient of 0.971. Interpretation of informative genomic regions determined from model saliency maps, and overlapping the saliency maps with known yeast motifs, supports the conclusion that our model can successfully locate the binding sites of transcription factors that actively modulate gene expression. We compare our model’s training times on a large compute cluster with GPUs and Google TPUs to indicate practical training times on similar datasets.
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1c8db8d6427ff696acaf7d2b526b222c
https://doi.org/10.1101/2022.12.02.518786
Rights: OPEN
Accession Number: edsair.doi.dedup.....1c8db8d6427ff696acaf7d2b526b222c
Database: OpenAIRE