Academic Journal

scBoolSeq: Linking scRNA-seq statistics and Boolean dynamics.

Bibliographic Details
Title: scBoolSeq: Linking scRNA-seq statistics and Boolean dynamics.
Authors: Magaña-López G (Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France); Calzone L (Institut Curie, Université PSL, Paris, France; INSERM, U900, Paris, France; Mines ParisTech, Université PSL, Paris, France); Zinovyev A (In silico R&D, Evotec, Toulouse, France); Paulevé L (Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France)
Source: PLoS computational biology [PLoS Comput Biol] 2024 Jul 08; Vol. 20 (7), pp. e1011620. Date of Electronic Publication: 2024 Jul 08 (Print Publication: 2024).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Public Library of Science; Country of Publication: United States; NLM ID: 101238922; Publication Model: eCollection; Cited Medium: Internet; ISSN: 1553-7358 (Electronic); Linking ISSN: 1553734X; NLM ISO Abbreviation: PLoS Comput Biol; Subsets: MEDLINE
Imprint Name(s): Original Publication: San Francisco, CA : Public Library of Science, [2005]-
MeSH Terms: Computational Biology*/methods ; Single-Cell Analysis*/methods ; Single-Cell Analysis*/statistics & numerical data ; Humans ; RNA-Seq/methods ; RNA-Seq/statistics & numerical data ; Gene Expression Profiling/methods ; Gene Expression Profiling/statistics & numerical data ; Sequence Analysis, RNA/methods ; Sequence Analysis, RNA/statistics & numerical data ; Algorithms ; Gene Regulatory Networks/genetics ; Models, Statistical ; Software ; Single-Cell Gene Expression Analysis
Abstract: Boolean networks are widely used to model the qualitative dynamics of cell fate processes by describing how the binary activation states of genes and transcription factors change over time. Being able to bridge such qualitative states with quantitative measurements of gene expression in cells, such as scRNA-seq, is a cornerstone of data-driven model construction and validation. On the one hand, scRNA-seq binarisation is a key step for inferring and validating Boolean models. On the other hand, generating synthetic scRNA-seq data from baseline Boolean models provides an important asset for benchmarking inference methods. However, linking characteristics of scRNA-seq datasets, including dropout events, with Boolean states is a challenging task. We present scBoolSeq, a method for the bidirectional linking of scRNA-seq data and Boolean activation states of genes. Given a reference scRNA-seq dataset, scBoolSeq computes statistical criteria to classify the empirical gene pseudocount distributions as unimodal, bimodal, or zero-inflated, and to fit a probabilistic model of dropouts with gene-dependent parameters. From these learnt distributions, scBoolSeq can both binarise scRNA-seq datasets and generate synthetic scRNA-seq datasets from Boolean traces, as issued from Boolean networks, using biased sampling and dropout simulation. We present a case study demonstrating the application of scBoolSeq's binarisation scheme in data-driven model inference. Furthermore, we compare synthetic scRNA-seq data generated by scBoolSeq with BoolODE's data for the same Boolean network model. The comparison shows that our method better reproduces the statistics of real scRNA-seq datasets, such as the mean-variance and mean-dropout relationships, while exhibiting clearly defined trajectories in two-dimensional projections of the data.
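Illustrative note: the three operations named in the abstract (classifying a gene's distribution, binarising a value, and sampling synthetic expression with dropout) can be sketched in a few lines of Python. This is a toy sketch under stated assumptions, not scBoolSeq's implementation or API. The function names, the zero-fraction cutoff, the use of Sarle's bimodality coefficient, the Gaussian modes, and the exponential dropout curve are all stand-ins chosen for illustration; the record does not specify which criteria or dropout model the tool actually uses.

```python
import math
import random
import statistics


def classify(values, zero_frac_cut=0.5, bc_cut=5 / 9):
    """Toy three-way classification of one gene's expression values.

    Assumption: a zero-fraction cutoff and Sarle's bimodality coefficient
    stand in for the (unspecified) statistical criteria of the real method.
    """
    n = len(values)
    if sum(v == 0 for v in values) / n > zero_frac_cut:
        return "zero-inflated"
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return "unimodal"
    skew = sum(((v - mean) / sd) ** 3 for v in values) / n
    kurt = sum(((v - mean) / sd) ** 4 for v in values) / n
    # Bimodality coefficient: values above 5/9 (the uniform case) hint at two modes.
    return "bimodal" if (skew ** 2 + 1) / kurt > bc_cut else "unimodal"


def binarise(value, low, high):
    """Map a value to 0, 1, or None (undetermined) using two hypothetical thresholds."""
    if value <= low:
        return 0
    if value >= high:
        return 1
    return None


def sample_expression(state, rng, low_mode=1.0, high_mode=8.0, sd=0.5, decay=0.5):
    """Draw one synthetic expression value for a Boolean state, then simulate dropout.

    Assumption: Gaussian modes and a dropout probability exp(-decay * x),
    so weakly expressed values are dropped (set to zero) more often.
    """
    x = max(0.0, rng.gauss(high_mode if state else low_mode, sd))
    p_drop = math.exp(-decay * x)
    return 0.0 if rng.random() < p_drop else x
```

For example, `classify([1.0] * 5 + [9.0] * 5)` returns `"bimodal"` and `binarise(4, 2, 6)` returns `None`. Passing a seeded `random.Random` instance to `sample_expression` keeps the synthetic draws reproducible.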
Competing Interests: The authors have declared that no competing interests exist.
(Copyright: © 2024 Magaña-López et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Entry Dates: Date Created: 20240708; Date Completed: 20240718; Latest Revision: 20240720
Update Code: 20240720
PubMed Central ID: PMC11257695
DOI: 10.1371/journal.pcbi.1011620
PMID: 38976751
Database: MEDLINE