Academic Journal

Pairtools: From sequencing data to chromosome contacts.

Bibliographic Details
Title: Pairtools: From sequencing data to chromosome contacts.
Authors: Abdennur N (Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America; Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America); Fudenberg G (Department of Computational and Quantitative Biology, University of Southern California, Los Angeles, California, United States of America); Flyamer IM (Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland); Galitsyna AA (Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America; Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Vienna, Austria); Goloborodko A (Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Vienna, Austria); Imakaev M (Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America); Venev SV (Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America).
Corporate Authors: Open2C
Source: PLoS computational biology [PLoS Comput Biol] 2024 May 29; Vol. 20 (5), pp. e1012164. Date of Electronic Publication: 2024 May 29 (Print Publication: 2024).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Public Library of Science; Country of Publication: United States; NLM ID: 101238922; Publication Model: eCollection; Cited Medium: Internet; ISSN: 1553-7358 (Electronic); Linking ISSN: 1553-734X; NLM ISO Abbreviation: PLoS Comput Biol; Subsets: MEDLINE
Imprint Name(s): Original Publication: San Francisco, CA : Public Library of Science, [2005]-
MeSH Terms: Software* ; Chromosomes*/genetics ; Chromosomes*/chemistry ; Computational Biology*/methods ; Humans ; Sequence Analysis, DNA/methods ; High-Throughput Nucleotide Sequencing/methods ; Chromosome Mapping/methods
Abstract: The field of 3D genome organization produces large amounts of sequencing data from Hi-C and a rapidly expanding set of other chromosome conformation protocols (3C+). Massive and heterogeneous 3C+ data require high-performance and flexible processing of sequenced reads into contact pairs. To meet these challenges, we present pairtools, a flexible suite of tools for contact extraction from sequencing data. Pairtools provides modular command-line interface (CLI) tools that can be flexibly chained into data processing pipelines. The core operations provided by pairtools are parsing of .sam alignments into Hi-C pairs, sorting, and removal of PCR duplicates. In addition, pairtools provides auxiliary tools for building feature-rich 3C+ pipelines, including contact pair manipulation, filtration, and quality control. Benchmarking pairtools against popular 3C+ data pipelines shows advantages of pairtools for high-performance and flexible 3C+ analysis. Finally, pairtools provides protocol-specific tools for restriction-based protocols, haplotype-resolved contacts, and single-cell Hi-C. The combination of CLI tools and tight integration with Python data analysis libraries makes pairtools a versatile foundation for a broad range of 3C+ pipelines.
Competing Interests: The authors have declared that no competing interests exist.
(Copyright: © 2024 Open2C et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Comments: Update of: bioRxiv. 2023 Feb 15:2023.02.13.528389. doi: 10.1101/2023.02.13.528389. (PMID: 36824968)
Grant Information: R01 HG003143 United States HG NHGRI NIH HHS; R35 GM143116 United States GM NIGMS NIH HHS; UM1 HG011536 United States HG NHGRI NIH HHS
Entry Date(s): Date Created: 20240529 Date Completed: 20240610 Latest Revision: 20240613
Update Code: 20240613
PubMed Central ID: PMC11164360
DOI: 10.1371/journal.pcbi.1012164
PMID: 38809952
Database: MEDLINE