New targets acquired: improving locus recovery from the Angiosperms353 probe set

Bibliographic details
Title: New targets acquired: improving locus recovery from the Angiosperms353 probe set
Authors: Lars Nauheimer, Nick Weigner, Joanne L. Birch, Elizabeth M. Joyce, Jennifer A. Tate, Weixuan Ning, Lalita Simpson, Todd G. B. McLay, Chris J. Jackson, William J. Baker, Félix Forest, Bee F. Gunn, Alexander N. Schmidt-Lebuhn
Publication details: Cold Spring Harbor Laboratory, 2020.
Publication year: 2020
Subject terms: Locus (genetics), Computational biology, Biology, Target enrichment
Description: Universal target enrichment kits maximise utility across wide evolutionary breadth while minimising the number of baits required to create a cost-efficient kit. Locus assembly requires a target reference, but the taxonomic breadth of the kit means that target reference files can be phylogenetically sparse. The Angiosperms353 kit has been successfully used to capture loci throughout angiosperms, but its target file includes sequence information from only 6–18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to references, resulting in fewer on-target reads for assembly and reduced locus recovery. We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a ‘mega353’ target file, with each gene represented by 17–373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene datasets. Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 31%, increased locus recovery at 75% length by 61.9%, and increased the total length of the concatenated loci by 30%. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::4fa12dc82f1c470556f84dea017b0a1e
https://doi.org/10.1101/2020.10.04.325571
Rights: OPEN
Accession number: edsair.doi...........4fa12dc82f1c470556f84dea017b0a1e
Database: OpenAIRE
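
Illustrative note: the description mentions subsampling the target file by user-selected taxon groups. The sketch below shows one way such a filter could look. It is not the authors' script from the NewTargets repository. It assumes FASTA headers of the form `>Taxon-Gene`, split at the last hyphen, which is the naming convention HybPiper target files follow. The function names, taxon codes (`TAXA`, `TAXB`), gene names, and sequences are all hypothetical.

```python
from collections import defaultdict

# Hypothetical miniature target file; real files hold hundreds of genes and taxa.
DEMO = ">TAXA-gene1\nATGC\nATGC\n>TAXB-gene1\nATGA\n>TAXA-gene2\nTTGA\n"


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample_targets(text, keep_taxa):
    """Keep only records whose taxon label is in keep_taxa.

    Assumes headers look like 'Taxon-Gene'; the taxon is everything
    before the last hyphen.
    """
    keep = set(keep_taxa)
    kept = []
    for header, seq in parse_fasta(text):
        taxon, _, _gene = header.rpartition("-")
        if taxon in keep:
            kept.append((header, seq))
    return kept


def taxa_per_gene(records):
    """Count how many distinct taxa represent each gene."""
    taxa = defaultdict(set)
    for header, _seq in records:
        taxon, _, gene = header.rpartition("-")
        taxa[gene].add(taxon)
    return {gene: len(names) for gene, names in taxa.items()}


if __name__ == "__main__":
    for header, seq in subsample_targets(DEMO, ["TAXA"]):
        print(f">{header}\n{seq}")
    print(taxa_per_gene(parse_fasta(DEMO)))
```

The `taxa_per_gene` helper gives the kind of per-locus representation count the description reports (6–18 taxa per locus in the original file, 17–373 in the mega353 file), which can be useful for checking that a subsampled file still covers every gene.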