Systematic discovery of conservation states for single-nucleotide annotation of the human genome

Bibliographic details
Title: Systematic discovery of conservation states for single-nucleotide annotation of the human genome
Authors: Adriana Arneson, Jason Ernst
Source: Communications Biology
Communications Biology, Vol 2, Iss 1, Pp 1-14 (2019)
Publication info: Nature Publishing Group UK, 2019.
Publication year: 2019
Subject terms: Epigenomics, Medicine (miscellaneous), Sequence alignment, Computational biology, Biology, Genome informatics, Genome, Polymorphism, Single Nucleotide, General Biochemistry, Genetics and Molecular Biology, Article, Evolutionary genetics, 03 medical and health sciences, 0302 clinical medicine, Cluster Analysis, Humans, lcsh:QH301-705.5, 030304 developmental biology, Comparative genomics, 0303 health sciences, Human evolutionary genetics, Genome, Human, Nucleotides, Computational Biology, Reproducibility of Results, Molecular Sequence Annotation, Genome project, Gene Annotation, Genomics, Chromatin, Markov Chains, Phenotype, lcsh:Biology (General), Multivariate Analysis, Human genome, General Agricultural and Biological Sciences, 030217 neurology & neurosurgery, Reference genome, Genome-Wide Association Study
Description: Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, ConsHMM, which applies a multivariate hidden Markov model to learn de novo ‘conservation states’ based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multiple species DNA sequence alignment. We applied ConsHMM to a 100-way vertebrate sequence alignment to annotate the human genome at single nucleotide resolution into 100 conservation states. These states have distinct enrichments for other genomic information including gene annotations, chromatin states, repeat families, and bases prioritized by various variant prioritization scores. Constrained elements have distinct heritability partitioning enrichments depending on their conservation state assignment. ConsHMM conservation states are a resource for analyzing genomes and genetic variants.
Adriana Arneson and Jason Ernst present a computational model, ConsHMM, for learning the conservation states of DNA sequences based on a multiple-species alignment. They apply ConsHMM to a 100-way vertebrate sequence alignment to provide single nucleotide annotations of conservation states of the human genome.
Language: English
ISSN: 2399-3642
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8334c1c4991cbcff96123d9f6be2cab4
http://europepmc.org/articles/PMC6606595
Rights: OPEN
Accession number: edsair.doi.dedup.....8334c1c4991cbcff96123d9f6be2cab4
Database: OpenAIRE