Pretraining strategies for effective promoter-driven gene expression prediction

Bibliographic details
Title: Pretraining strategies for effective promoter-driven gene expression prediction
Authors: Aniketh Janardhan Reddy, Michael H. Herschl, Sathvik Kolli, Amy X. Lu, Xinyang Geng, Aviral Kumar, Patrick D. Hsu, Sergey Levine, Nilah M. Ioannidis
Publication details: Cold Spring Harbor Laboratory, 2023.
Publication year: 2023
Description: Advances in gene delivery technologies are enabling rapid progress in molecular medicine, but require precise expression of genetic cargo in desired cell types, which is predominantly achieved via a regulatory DNA sequence called a promoter; however, only a handful of cell type-specific promoters are known. Efficiently designing compact promoter sequences with a high density of regulatory information by leveraging machine learning models would therefore be broadly impactful for fundamental research and direct therapeutic applications. However, models of expression from such compact promoter sequences are lacking, despite the recent success of deep learning in modelling expression from endogenous regulatory sequences. Despite the lack of large datasets measuring promoter-driven expression in many cell types, data from a few well-studied cell types or from endogenous gene expression may provide relevant information for transfer learning, which has not yet been explored in this setting. Here, we evaluate a variety of pretraining tasks and transfer strategies for modelling cell type-specific expression from compact promoters and demonstrate the effectiveness of pretraining on existing promoter-driven expression datasets from other cell types. Our approach is broadly applicable for modelling promoter-driven expression in any data-limited cell type of interest, and will enable the use of model-based optimization techniques for promoter design for gene delivery applications. Our code and data are available at https://github.com/anikethjr/promoter_models.
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::e9e641a78af0c16c2e7d90c51f963f34
https://doi.org/10.1101/2023.02.24.529941
Accession number: edsair.doi...........e9e641a78af0c16c2e7d90c51f963f34
Database: OpenAIRE