Academic Journal

CDBProm: the Comprehensive Directory of Bacterial Promoters.

Bibliographic Details
Title: CDBProm: the Comprehensive Directory of Bacterial Promoters.
Authors: Martinez GS; Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia B3H 4H7, Canada.; Pediatrics, Izaak Walton Killam (IWK) Health Center. Canadian Center for Vaccinology (CCfV), Halifax, Nova Scotia B3H 4H7, Canada.; BioForge Canada Limited, Halifax, Nova Scotia B3N 3B9, Canada., Perez-Rueda E; Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Unidad Académica del Estado de Yucatán, Mérida 97302, Yucatán, Mexico., Kumar A; Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia B3H 4H7, Canada.; Pediatrics, Izaak Walton Killam (IWK) Health Center. Canadian Center for Vaccinology (CCfV), Halifax, Nova Scotia B3H 4H7, Canada.; BioForge Canada Limited, Halifax, Nova Scotia B3N 3B9, Canada., Dutt M; Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia B3H 4H7, Canada.; Pediatrics, Izaak Walton Killam (IWK) Health Center. Canadian Center for Vaccinology (CCfV), Halifax, Nova Scotia B3H 4H7, Canada.; BioForge Canada Limited, Halifax, Nova Scotia B3N 3B9, Canada., Maya CR; Facultad de Ciencias e Ingeniería, Universidad Nacional Autonoma de Mexico, Mexico City 04510, Mexico., Ledesma-Dominguez L; Instituto de Investigaciones en Matematicas Aplicadas y en Sistemas, Universidad Nacional Autonoma de Mexico, Mexico City 04510, Mexico., Casa PL; Biotechnology Institute, Universidade de Caxias do Sul, Caxias do Sul, Rio Grande do Sul 95070-560, Brazil., Kumar A; Molecular Biology and Biotechnology, Tezpur University, Tezpur, Assam 784028, India., de Avila E Silva S; Biotechnology Institute, Universidade de Caxias do Sul, Caxias do Sul, Rio Grande do Sul 95070-560, Brazil., Kelvin DJ; Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia B3H 4H7, Canada.; Pediatrics, Izaak Walton Killam (IWK) Health Center. Canadian Center for Vaccinology (CCfV), Halifax, Nova Scotia B3H 4H7, Canada.; BioForge Canada Limited, Halifax, Nova Scotia B3N 3B9, Canada.
Source: NAR genomics and bioinformatics [NAR Genom Bioinform] 2024 Feb 21; Vol. 6 (1), pp. lqae018. Date of Electronic Publication: 2024 Feb 21 (Print Publication: 2024).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Oxford University Press Country of Publication: England NLM ID: 101756213 Publication Model: eCollection Cited Medium: Internet ISSN: 2631-9268 (Electronic) Linking ISSN: 26319268 NLM ISO Abbreviation: NAR Genom Bioinform Subsets: PubMed not MEDLINE
Imprint Name(s): Original Publication: [Oxford] : Oxford University Press, [2019]-
Abstract: The decreasing cost of whole genome sequencing has produced high volumes of genomic information that require annotation. The experimental identification of promoter sequences, pivotal for regulating gene expression, is a laborious and cost-prohibitive task. To expedite this, we introduce the Comprehensive Directory of Bacterial Promoters (CDBProm), a directory of in-silico predicted bacterial promoter sequences. We first identified that an Extreme Gradient Boosting (XGBoost) algorithm would distinguish promoters from random downstream regions with an accuracy of 87%. To capture distinctive promoter signals, we generated a second XGBoost classifier trained on the instances misclassified in our first classifier. The predictor of CDBProm is then fed with over 55 million upstream regions from more than 6000 bacterial genomes. Upon finding potential promoter sequences in upstream regions, each promoter is mapped to the genomic data of the organism, linking the predicted promoter with its coding DNA sequence, and identifying the function of the gene regulated by the promoter. The collection of bacterial promoters available in CDBProm enables the quantitative analysis of a plethora of bacterial promoters. Our collection with over 24 million promoters is publicly available at https://aw.iimas.unam.mx/cdbprom/.
(© The Author(s) 2024. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.)
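Illustrative note: the abstract describes feeding upstream regions to a predictor and linking each candidate promoter back to its coding sequence and gene function. The following is a minimal sketch of that bookkeeping step only, not the authors' pipeline. The `Gene` record, the function names, and the 80-base window are hypothetical choices made for illustration; the abstract does not state a window length, and no classifier is included here.

```python
from dataclasses import dataclass

# Translation table used to complement minus-strand sequence.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record (0-based, end-exclusive coordinates)."""
    locus_tag: str
    start: int
    end: int
    strand: str   # "+" or "-"
    product: str  # annotated gene function


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome: str, gene: Gene, length: int = 80) -> str:
    """Return up to `length` bases directly upstream of `gene`,
    written 5'->3' on the gene's own strand.

    The default of 80 is an assumption for illustration only.
    """
    if gene.strand == "+":
        return genome[max(0, gene.start - length):gene.start]
    # For a minus-strand gene, upstream lies after `end` on the forward strand.
    return reverse_complement(genome[gene.end:gene.end + length])


def link_upstream_regions(genome: str, genes: list[Gene], length: int = 80) -> list[dict]:
    """Pair each gene's upstream region with its locus tag and function,
    so a later promoter call can be traced back to the regulated gene."""
    return [
        {
            "locus_tag": g.locus_tag,
            "product": g.product,
            "upstream": upstream_region(genome, g, length),
        }
        for g in genes
    ]
```

In a sketch like this, each returned record would be scored by whatever promoter classifier is in use, and records that pass would keep their `locus_tag` and `product` fields as the link between the predicted promoter and the gene it regulates.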
Entry Date(s): Date Created: 20240222 Latest Revision: 20240224
Update Code: 20240224
PubMed Central ID: PMC10880602
DOI: 10.1093/nargab/lqae018
PMID: 38385146
Database: MEDLINE