Systematic discovery of conservation states for single-nucleotide annotation of the human genome
Title: | Systematic discovery of conservation states for single-nucleotide annotation of the human genome |
---|---|
Authors: | Adriana Arneson, Jason Ernst |
Source: | Communications Biology, Vol 2, Iss 1, Pp 1-14 (2019) |
Publication information: | Nature Publishing Group UK, 2019. |
Publication year: | 2019 |
Subject terms: | Epigenomics, Medicine (miscellaneous), Sequence alignment, Computational biology, Biology, Genome informatics, Genome, Polymorphism, Single Nucleotide, General Biochemistry, Genetics and Molecular Biology, Article, Evolutionary genetics, 03 medical and health sciences, 0302 clinical medicine, Cluster Analysis, Humans, lcsh:QH301-705.5, 030304 developmental biology, Comparative genomics, 0303 health sciences, Human evolutionary genetics, Genome, Human, Nucleotides, Computational Biology, Reproducibility of Results, Molecular Sequence Annotation, Genome project, Gene Annotation, Genomics, Chromatin, Markov Chains, Phenotype, lcsh:Biology (General), Multivariate Analysis, Human genome, General Agricultural and Biological Sciences, 030217 neurology & neurosurgery, Reference genome, Genome-Wide Association Study |
Description: | Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary elements of evolutionary constraint. Here we present a complementary whole genome annotation approach, ConsHMM, which applies a multivariate hidden Markov model to learn de novo ‘conservation states’ based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multiple species DNA sequence alignment. We applied ConsHMM to a 100-way vertebrate sequence alignment to annotate the human genome at single nucleotide resolution into 100 conservation states. These states have distinct enrichments for other genomic information including gene annotations, chromatin states, repeat families, and bases prioritized by various variant prioritization scores. Constrained elements have distinct heritability partitioning enrichments depending on their conservation state assignment. ConsHMM conservation states are a resource for analyzing genomes and genetic variants. Adriana Arneson and Jason Ernst present a computational model, ConsHMM, for learning the conservation states of DNA sequences based on a multiple-species alignment. They apply ConsHMM to a 100-way vertebrate sequence alignment to provide single nucleotide annotations of conservation states of the human genome. |
Language: | English |
ISSN: | 2399-3642 |
Access URL: | https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8334c1c4991cbcff96123d9f6be2cab4 http://europepmc.org/articles/PMC6606595 |
Rights: | OPEN |
Accession number: | edsair.doi.dedup.....8334c1c4991cbcff96123d9f6be2cab4 |
Database: | OpenAIRE |
---|
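Illustrative note: the description says the model learns from which species "align to and match" the reference genome at each position. A minimal sketch of how such per-position observations might be encoded is given below. It is not the authors' code. The three-way coding (not aligned, aligned and matching, aligned and mismatching), the function names, and the gap characters are all assumptions made for illustration.

```python
from typing import List

# Hypothetical observation codes (an assumption, not taken from the record).
NOT_ALIGNED = 0  # species has no base aligned at this reference position
MATCH = 1        # species aligns and has the same nucleotide as the reference
MISMATCH = 2     # species aligns but has a different nucleotide

GAP_CHARS = {"-", "."}  # assumed gap symbols in the alignment rows


def encode_column(ref_base: str, other_bases: List[str]) -> List[int]:
    """Encode one alignment column as one observation per non-reference species."""
    codes = []
    for base in other_bases:
        if base in GAP_CHARS:
            codes.append(NOT_ALIGNED)
        elif base.upper() == ref_base.upper():
            codes.append(MATCH)
        else:
            codes.append(MISMATCH)
    return codes


def encode_alignment(ref: str, others: List[str]) -> List[List[int]]:
    """Encode every reference position; rows must all have the reference's length."""
    if any(len(row) != len(ref) for row in others):
        raise ValueError("all alignment rows must have the same length")
    return [
        encode_column(ref[i], [row[i] for row in others])
        for i in range(len(ref))
    ]


if __name__ == "__main__":
    # Toy alignment: one reference row and two other species.
    for position, obs in enumerate(encode_alignment("ACGT", ["ACTT", "A--T"])):
        print(position, obs)
```

Each inner list is the multivariate observation for one reference base. A hidden Markov model fitted to the sequence of these vectors would then assign a state to every position, which is the general idea the description refers to as single-nucleotide conservation states.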