Robust and accurate estimation of cellular fraction from tissue omics data via ensemble deconvolution

Bibliographic details
Title: Robust and accurate estimation of cellular fraction from tissue omics data via ensemble deconvolution
Authors: Manqi Cai, Molin Yue, Tianmeng Chen, Jinling Liu, Erick Forno, Xinghua Lu, Timothy Billiar, Juan Celedón, Chris McKennan, Wei Chen, Jiebiao Wang
Source: Bioinformatics
Publication information: Oxford University Press, 2022.
Publication year: 2022
Subject terms: Statistics and Probability, Computational Mathematics, Computational Theory and Mathematics, Sequence Analysis, RNA, RNA, Computer Simulation, RNA-Seq, Transcriptome, Molecular Biology, Biochemistry, Original Papers, Computer Science Applications
Description: Motivation: Tissue-level omics data such as transcriptomics and epigenomics are averages across diverse cell types. To extract cell-type-specific (CTS) signals, dozens of cellular deconvolution methods have been proposed to infer cell-type fractions from tissue-level data. However, these methods produce vastly different results under various real data settings. Simulation-based benchmarking studies have found no universally best deconvolution approach. Ensemble methods have been attempted, but they aggregate only multiple single-cell references or reference-free deconvolution methods.
Results: To achieve a robust estimation of cellular fractions, we proposed EnsDeconv (Ensemble Deconvolution), which adopts CTS robust regression to synthesize the results from 11 single deconvolution methods, 10 reference datasets, 5 marker gene selection procedures, 5 data normalizations and 2 transformations. Unlike most benchmarking studies, which are based on simulations, we compiled four large real datasets totaling 4937 tissue samples from different tissues, each with measured cellular fractions and bulk gene expression. Comprehensive evaluations demonstrated that EnsDeconv yields more stable, robust and accurate fractions than existing methods. We illustrated that EnsDeconv-estimated cellular fractions enable various CTS downstream analyses, such as testing for differential fractions associated with clinical variables. We further extended EnsDeconv to analyze bulk DNA methylation data.
Availability and implementation: EnsDeconv is freely available as an R package from https://github.com/randel/EnsDeconv. The RNA microarray data from the TRAUMA study are available in GEO (GSE36809). The demographic and clinical phenotypes can be shared on reasonable request to the corresponding authors. The RNA-seq data from the EVAPR study cannot be shared publicly, to protect the privacy of individuals who participated in the clinical research, in compliance with the IRB approval at the University of Pittsburgh. The RNA microarray data from the FHS study are available from dbGaP (phs000007.v32.p13). The RNA-seq data from the ROS study were downloaded from the AD Knowledge Portal.
Supplementary information: Supplementary data are available at Bioinformatics online.
Language: English
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::954fb1065c4da0b381bbd95423cb7c4e
https://europepmc.org/articles/PMC9991889/
Rights: OPEN
Accession number: edsair.doi.dedup.....954fb1065c4da0b381bbd95423cb7c4e
Database: OpenAIRE