Academic Journal

SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data

Bibliographic Details
Title: SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data
Authors: Eric M. Davis, Yu Sun, Yanling Liu, Pandurang Kolekar, Ying Shao, Karol Szlachta, Heather L. Mulder, Dongren Ren, Stephen V. Rice, Zhaoming Wang, Joy Nakitandwe, Alexander M. Gout, Bridget Shaner, Salina Hall, Leslie L. Robison, Stanley Pounds, Jeffery M. Klco, John Easton, Xiaotu Ma
Source: Genome Biology, Vol 22, Iss 1, Pp 1-18 (2021)
Publication Information: BMC, 2021.
Publication Year: 2021
Collection: LCC:Biology (General)
LCC:Genetics
Subject Terms: Sequencer/instrument error, Error suppression, DNA sequencing, Biology (General), QH301-705.5, Genetics, QH426-470
Description: Background: There is currently no method to precisely measure the errors that occur in the sequencing instrument (sequencer), a measurement that is critical for next-generation sequencing applications aimed at discovering the genetic makeup of heterogeneous cellular populations. Results: We propose a novel computational method, SequencErr, to address this challenge by measuring the base correspondence between overlapping regions in forward and reverse reads. An analysis of 3777 public datasets from 75 research institutions in 18 countries revealed the sequencer error rate to be ~10 per million (pm); 1.4% of sequencers and 2.7% of flow cells have error rates >100 pm. At the flow cell level, error rates are elevated in the bottom surfaces, and >90% of HiSeq and NovaSeq flow cells have at least one outlier error-prone tile. By sequencing a common DNA library on different sequencers, we demonstrate that sequencers with high error rates have reduced overall sequencing accuracy, and that removal of outlier error-prone tiles improves sequencing accuracy. We demonstrate that SequencErr can reveal novel insights relative to the popular quality control method FastQC and achieve a 10-fold lower error rate than popular error correction methods including Lighter and Musket. Conclusions: Our study reveals novel insights into the nature of DNA sequencing errors incurred on DNA sequencers. Our method can be used to assess, calibrate, and monitor sequencer accuracy, and to computationally suppress sequencer errors in existing datasets.
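Illustrative note: the abstract describes measuring base correspondence between the overlapping regions of forward and reverse reads. The Python sketch below shows that general idea only and is not the authors' implementation. It assumes that the overlap length of each read pair is already known (in practice it would come from alignment), that reads contain only the letters A, C, G, T, and N, and that positions with an N are skipped. The function names `reverse_complement`, `overlap_mismatches`, and `errors_per_million` are hypothetical.

```python
# Hypothetical sketch: count disagreements in the overlap of a read pair.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_mismatches(forward, reverse, overlap):
    """Compare the last `overlap` bases of the forward read with the first
    `overlap` bases of the reverse-complemented reverse read.

    Returns (mismatches, compared), skipping positions where either base is N.
    Assumption: `overlap` is supplied by the caller, not inferred here.
    """
    if not 0 < overlap <= min(len(forward), len(reverse)):
        raise ValueError("overlap must be between 1 and the shorter read length")
    fwd_part = forward[-overlap:]
    rev_part = reverse_complement(reverse)[:overlap]
    mismatches = 0
    compared = 0
    for a, b in zip(fwd_part, rev_part):
        if a == "N" or b == "N":
            continue  # uncalled base: no evidence either way
        compared += 1
        if a != b:
            mismatches += 1
    return mismatches, compared


def errors_per_million(pairs):
    """Aggregate mismatches over (forward, reverse, overlap) tuples and
    express the disagreement rate per million compared bases."""
    total_mismatches = 0
    total_compared = 0
    for forward, reverse, overlap in pairs:
        mm, cmp_count = overlap_mismatches(forward, reverse, overlap)
        total_mismatches += mm
        total_compared += cmp_count
    if total_compared == 0:
        return float("nan")
    return 1e6 * total_mismatches / total_compared
```

A disagreement in the overlap shows only that at least one of the two reads carries an error at that position. Attributing errors to a particular sequencer, flow cell, or tile, as the abstract reports, would additionally require the instrument metadata attached to each read, which this sketch does not handle.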
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1474-760X
Relation: https://doaj.org/toc/1474-760X
DOI: 10.1186/s13059-020-02254-2
Access URL: https://doaj.org/article/b4bdbc0e7ab44eaa83bb934433d4325f
Accession Number: edsdoj.b4bdbc0e7ab44eaa83bb934433d4325f
Database: Directory of Open Access Journals