Sigmoni: classification of nanopore signal with a compressed pangenome index.

Bibliographic Details
Title: Sigmoni: classification of nanopore signal with a compressed pangenome index.
Authors: Shivakumar VS; Department of Computer Science, Johns Hopkins University., Ahmed OY; Department of Computer Science, Johns Hopkins University., Kovaka S; Department of Computer Science, Johns Hopkins University., Zakeri M; Department of Computer Science, Johns Hopkins University., Langmead B; Department of Computer Science, Johns Hopkins University.
Source: bioRxiv: the preprint server for biology [bioRxiv] 2023 Aug 30. Date of Electronic Publication: 2023 Aug 30.
Publication Type: Preprint
Language: English
Journal Info: Country of Publication: United States NLM ID: 101680187 Publication Model: Electronic Cited Medium: Internet ISSN: 2692-8205 (Electronic) Linking ISSN: 26928205 NLM ISO Abbreviation: bioRxiv Subsets: PubMed not MEDLINE
Abstract: Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. However, past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index that scales to references of hundreds of Gbps. Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics. Sigmoni is 10-100× faster than previous methods for adaptive sampling in host depletion experiments with improved accuracy, and can query reads against large microbial or human pangenomes.
Competing Interests: SK has received travel funding from Oxford Nanopore Technologies Limited.
Comments: Update in: Bioinformatics. 2024 Jun 28;40(Supplement_1):i287-i296. doi: 10.1093/bioinformatics/btae213. (PMID: 38940135)
Grant Information: R01 HG011392 United States HG NHGRI NIH HHS; U01 CA253481 United States CA NCI NIH HHS
Entry Dates: Date Created: 20230830 Latest Revision: 20240708
Update Code: 20240708
PubMed Central ID: PMC10462034
DOI: 10.1101/2023.08.15.553308
PMID: 37645873
Database: MEDLINE