CrowdVariant: a crowdsourcing approach to classify copy number variants

Bibliographic details
Title: CrowdVariant: a crowdsourcing approach to classify copy number variants
Authors: Peyton Greenside, Justin Zook, Marc Salit, Madeleine Cule, Ryan Poplin, Mark DePristo
Source: Pacific Symposium on Biocomputing. 24
Publication year: 2019
Subject terms: DNA Copy Number Variations; Genome, Human; Computational Biology; Crowdsourcing; High-Throughput Nucleotide Sequencing; Humans; Genomics; Sequence Analysis, DNA; Algorithms; Data Curation
Description: Copy number variants (CNVs) are an important type of genetic variation that play a causal role in many diseases. The ability to identify high quality CNVs is of substantial clinical relevance. However, CNVs are notoriously difficult to identify accurately from array-based methods and next-generation sequencing (NGS) data, particularly for small (<10kbp) CNVs. Manual curation by experts widely remains the gold standard but cannot scale with the pace of sequencing, particularly in fast-growing clinical applications. We present the first proof-of-principle study demonstrating high throughput manual curation of putative CNVs by non-experts. We developed a crowdsourcing framework, called CrowdVariant, that leverages Google's high-throughput crowdsourcing platform to create a high confidence set of deletions for NA24385 (NIST HG002/RM 8391), an Ashkenazim reference sample developed in partnership with the Genome In A Bottle (GIAB) Consortium. We show that non-experts tend to agree both with each other and with experts on putative CNVs. We show that crowdsourced non-expert classifications can be used to accurately assign copy number status to putative CNV calls and identify 1,781 high confidence deletions in a reference sample. Multiple lines of evidence suggest these calls are a substantial improvement over existing CNV callsets and can also be useful in benchmarking and improving CNV calling algorithms. Our crowdsourcing methodology takes the first step toward showing the clinical potential for manual curation of CNVs at scale and can further guide other crowdsourcing genomics applications.
ISSN: 2335-6936
Access URL: https://explore.openaire.eu/search/publication?articleId=pmid________::3a1cebafa171c57ddcf232f5b3b9d408
https://pubmed.ncbi.nlm.nih.gov/30864325
Accession number: edsair.pmid..........3a1cebafa171c57ddcf232f5b3b9d408
Database: OpenAIRE