Academic Journal

Regulatory activity is the default DNA state in eukaryotes.

Bibliographic Details
Title: Regulatory activity is the default DNA state in eukaryotes.
Authors: Luthra I; School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada., Jensen C; School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada., Chen XE; School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada., Salaudeen AL; School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada., Rafi AM; School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada., de Boer CG; School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada. carl.deboer@ubc.ca.
Source: Nature structural & molecular biology [Nat Struct Mol Biol] 2024 Mar; Vol. 31 (3), pp. 559-567. Date of Electronic Publication: 2024 Mar 06.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Nature Pub. Group Country of Publication: United States NLM ID: 101186374 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1545-9985 (Electronic) Linking ISSN: 15459985 NLM ISO Abbreviation: Nat Struct Mol Biol Subsets: MEDLINE
Imprint Name(s): Original Publication: New York : Nature Pub. Group, c2004-
MeSH Terms: Saccharomyces cerevisiae*/genetics; Transcriptome*; Humans; Genome; DNA; Chromatin
Abstract: Genomes encode for genes and non-coding DNA, both capable of transcriptional activity. However, unlike canonical genes, many transcripts from non-coding DNA have limited evidence of conservation or function. Here, to determine how much biological noise is expected from non-genic sequences, we quantify the regulatory activity of evolutionarily naive DNA using RNA-seq in yeast and computational predictions in humans. In yeast, more than 99% of naive DNA bases were transcribed. Unlike the evolved transcriptome, naive transcripts frequently overlapped with opposite sense transcripts, suggesting selection favored coherent gene structures in the yeast genome. In humans, regulation-associated chromatin activity is predicted to be common in naive dinucleotide-content-matched randomized DNA. Here, naive and evolved DNA have similar co-occurrence and cell-type specificity of chromatin marks, challenging these as indicators of selection. However, in both yeast and humans, extreme high activities were rare in naive DNA, suggesting they result from selection. Overall, basal regulatory activity seems to be the default, which selection can hone to evolve a function or, if detrimental, repress.
(© 2024. The Author(s), under exclusive licence to Springer Nature America, Inc.)
Substance Nomenclature: 9007-49-2 (DNA)
0 (Chromatin)
Entry Date(s): Date Created: 20240306 Date Completed: 20240320 Latest Revision: 20240323
Update Code: 20240323
DOI: 10.1038/s41594-024-01235-4
PMID: 38448573
Database: MEDLINE