Academic Journal

Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-Terminal Coding Sequences.

Bibliographic Details
Title: Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-Terminal Coding Sequences.
Authors: Yan Z; School of Computing, National University of Singapore, Singapore 117417, Singapore., Chu W; Science Center for Future Foods, Jiangnan University, Wuxi 214122, PR China., Sheng Y; Science Center for Future Foods, Jiangnan University, Wuxi 214122, PR China., Tang K; School of Computing, National University of Singapore, Singapore 117417, Singapore., Wang S; Department of Mathematics, National University of Singapore, Singapore 119077, Singapore., Liu Y; Science Center for Future Foods, Jiangnan University, Wuxi 214122, PR China., Wong WF; School of Computing, National University of Singapore, Singapore 117417, Singapore.
Source: ACS synthetic biology [ACS Synth Biol] 2024 Sep 20; Vol. 13 (9), pp. 2960-2968. Date of Electronic Publication: 2024 Sep 04.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: American Chemical Society Country of Publication: United States NLM ID: 101575075 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 2161-5063 (Electronic) Linking ISSN: 21615063 NLM ISO Abbreviation: ACS Synth Biol Subsets: MEDLINE
Imprint Name(s): Original Publication: Washington, D.C. : American Chemical Society, c2012-
MeSH Terms: Synthetic Biology*/methods; Green Fluorescent Proteins*/genetics; Green Fluorescent Proteins*/metabolism; Bacillus subtilis*/genetics; Bacillus subtilis*/metabolism; Deep Learning*; Gene Expression/genetics; Algorithms; Genetic Engineering/methods
Abstract: The N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization, such as rational design and statistics-guided approaches, are labor-intensive and yield only relatively small improvements. This paper introduces a deep learning/synthetic biology codesigned few-shot training workflow for NCS optimization. Our method utilizes k-nearest encoding followed by word2vec to encode the NCS, then performs feature extraction using attention mechanisms, before constructing a time-series network for predicting gene expression intensity; finally, a direct search algorithm identifies the optimal NCS with limited training data. We took green fluorescent protein (GFP) expressed by Bacillus subtilis as a reporter protein for NCSs, and employed the fluorescence enhancement factor as the metric of NCS optimization. Within just six iterative experiments, our model generated an NCS (MLD62) that increased average GFP expression by 5.41-fold, outperforming the state-of-the-art NCS designs. Extending our findings beyond GFP, we showed that our engineered NCS (MLD62) can effectively boost the production of N-acetylneuraminic acid by enhancing the expression of the crucial rate-limiting GNA1 gene, demonstrating its practical utility. We have open-sourced our NCS expression database and experimental procedures for public use.
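Note: The abstract names the fluorescence enhancement factor as its metric but does not define it. The sketch below assumes one plausible reading: the ratio of mean fluorescence with a candidate NCS to mean fluorescence of a control construct. The function name `enhancement_factor` and the readings are hypothetical and are not taken from the paper.

```python
from statistics import mean


def enhancement_factor(variant_readings, control_readings):
    """Return mean variant fluorescence divided by mean control fluorescence.

    Assumed definition for illustration; the paper may normalize differently
    (for example, per cell density or after background subtraction).
    """
    if not variant_readings or not control_readings:
        raise ValueError("both reading lists must be non-empty")
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return mean(variant_readings) / control_mean


# Made-up readings in arbitrary units, for illustration only (not paper data).
print(enhancement_factor([300.0, 320.0, 310.0], [100.0, 100.0, 110.0]))
```

Under this reading, a value of 5.41 would mean the candidate NCS gave about 5.41 times the control's average fluorescence.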
Contributed Indexing: Keywords: Bacillus subtilis; N-acetylneuraminic acid; N-terminal coding sequence; deep learning; few-shot learning; gene expression regulation
Substance Nomenclature: 147336-22-9 (Green Fluorescent Proteins)
Entry Date(s): Date Created: 20240904 Date Completed: 20240920 Latest Revision: 20240920
Update Code: 20240921
DOI: 10.1021/acssynbio.4c00371
PMID: 39229974
Database: MEDLINE