Academic Journal

A Practical Approach to Using the Genomic Standards Consortium MIxS Reporting Standard for Comparative Genomics and Metagenomics.

Bibliographic Details
Title: A Practical Approach to Using the Genomic Standards Consortium MIxS Reporting Standard for Comparative Genomics and Metagenomics.
Authors: Eloe-Fadrosh EA; Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA. eaeloefadrosh@lbl.gov., Mungall CJ; Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA., Miller MA; Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA., Smith M; Pacific Northwest National Laboratory, Richland, WA, USA., Patil SS; Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA., Kelliher JM; Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA., Johnson LYD; Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA., Rodriguez FE; Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA., Chain PSG; Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA., Hu B; Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA., Thornton MB; Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA., McCue LA; Pacific Northwest National Laboratory, Richland, WA, USA., McHardy AC; Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany., Harris NL; Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA., Reddy TBK; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA., Mukherjee S; DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA., Hunter CI; GigaScience Press, Hong Kong Science Park, Pak Shek Kok, New Territories, Hong Kong., Walls R; Critical Path Institute, Tucson, AZ, USA., Schriml LM; University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA.
Source: Methods in molecular biology (Clifton, N.J.) [Methods Mol Biol] 2024; Vol. 2802, pp. 587-609.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Humana Press Country of Publication: United States NLM ID: 9214969 Publication Model: Print Cited Medium: Internet ISSN: 1940-6029 (Electronic) Linking ISSN: 1064-3745 NLM ISO Abbreviation: Methods Mol Biol Subsets: MEDLINE
Imprint Name(s): Publication: Totowa, NJ : Humana Press
Original Publication: Clifton, N.J. : Humana Press,
MeSH Terms: Metagenomics*/methods; Metagenomics*/standards; Genomics*/methods; Genomics*/standards; Metagenome*/genetics; Databases, Genetic; Soil Microbiology
Abstract: Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we provide a description of the structure and terminology, how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example.
(© 2024. The Author(s).)
Contributed Indexing: Keywords: Genome; Metadata; Metagenome; Schema; Validation; Standards
Entry Date(s): Date Created: 20240531 Date Completed: 20240531 Latest Revision: 20240531
Update Code: 20240531
DOI: 10.1007/978-1-0716-3838-5_20
PMID: 38819573
Database: MEDLINE