Academic Journal

Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer.

Bibliographic Details
Title: Machine learning assisted analysis of breast cancer gene expression profiles reveals novel potential prognostic biomarkers for triple-negative breast cancer.
Authors: Thalor A; Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India., Kumar Joon H; Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India.; Regional Centre for Biotechnology, Faridabad 121001, Haryana, India., Singh G; Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India., Roy S; Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India., Gupta D; Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India.
Source: Computational and structural biotechnology journal [Comput Struct Biotechnol J] 2022 Mar 24; Vol. 20, pp. 1618-1631. Date of Electronic Publication: 2022 Mar 24 (Print Publication: 2022).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology Country of Publication: Netherlands NLM ID: 101585369 Publication Model: eCollection Cited Medium: Print ISSN: 2001-0370 (Print) Linking ISSN: 20010370 NLM ISO Abbreviation: Comput Struct Biotechnol J Subsets: PubMed not MEDLINE
Imprint Name(s): Publication: Amsterdam : Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology
Original Publication: Gothenburg, Sweden : Research Network of Computational and Structural Biotechnology
Abstract: Tumor heterogeneity and unclear metastasis mechanisms are the leading causes of the lack of effective targeted therapy for triple-negative breast cancer (TNBC), a breast cancer (BrCa) subtype characterized by high mortality and a high frequency of distant metastasis. The identification of prognostic biomarkers can improve prognosis and personalized treatment regimes. Herein, we collected gene expression datasets representing TNBC and non-TNBC BrCa. From the complete dataset, a subset containing only known cancer driver genes was also constructed. Recursive Feature Elimination (RFE) was employed to identify the top 20, 25, 30, 35, 40, 45, and 50 gene signatures that differentiate TNBC from the other BrCa subtypes. Five machine learning algorithms were applied to these selected features. Model performance evaluation showed that XGBoost performs best with a subset of 25 genes for the complete dataset and 20 genes for the driver dataset. Out of these 45 genes from the two datasets, 34 genes were found to be differentially regulated. Kaplan-Meier (KM) analysis of Distant Metastasis Free Survival (DMFS) for these 34 differentially regulated genes revealed four potential prognostic genes, two of which are novel (POU2AF1 and S100B). Finally, interactome and pathway enrichment analyses were carried out to investigate the functional role of the identified potential prognostic genes in TNBC. These genes are associated with MAPK, PI3K-Akt, Wnt, TGF-β, and other signal transduction pathways that are pivotal in the metastasis cascade. These gene signatures can provide novel molecular-level insights into metastasis.
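The feature-selection step described in the abstract (RFE over several signature sizes, followed by model evaluation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the gene names are hypothetical, only three of the seven signature sizes are shown, and scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost so the example needs no extra dependency.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for an expression matrix (samples x genes); NOT real data.
# y plays the role of the TNBC (1) vs non-TNBC (0) label.
X, y = make_classification(n_samples=120, n_features=60,
                           n_informative=10, random_state=0)
gene_names = np.array([f"gene_{i}" for i in range(X.shape[1])])  # hypothetical names


def select_signature(X, y, n_genes, step=5):
    """Return a boolean mask of the n_genes features kept by RFE."""
    # Assumption: gradient boosting stands in for XGBoost here.
    estimator = GradientBoostingClassifier(n_estimators=30, random_state=0)
    selector = RFE(estimator, n_features_to_select=n_genes, step=step)
    selector.fit(X, y)
    return selector.support_


results = {}
for n_genes in (20, 25, 30):  # subset of the sizes named in the abstract
    mask = select_signature(X, y, n_genes)
    model = GradientBoostingClassifier(n_estimators=30, random_state=0)
    # Mean 5-fold cross-validated area under the ROC curve for this signature.
    auc = cross_val_score(model, X[:, mask], y, cv=5, scoring="roc_auc").mean()
    results[n_genes] = (gene_names[mask], auc)

for n_genes, (genes, auc) in results.items():
    print(n_genes, "genes, mean AUC:", round(auc, 3))
```

For brevity the sketch selects features on the full dataset before cross-validating, which gives optimistic AUC estimates. A stricter evaluation would wrap RFE and the classifier in a single scikit-learn `Pipeline` so that selection is repeated inside each fold.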
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(© 2022 The Authors.)
Contributed Indexing: Keywords: AUC, Area under the ROC curve; BrCa, Breast cancer; COSMIC, The catalogue of somatic mutations in cancer; CX-25, Complete XGBoost top 25; DE, Differential Expression; DMFS, Distant metastasis free survival; DX-20, Driver XGBoost top 20; Differential gene expression; Distant-metastasis free survival; EMT, Epithelial to mesenchymal transition; ER, Oestrogen Receptor; FDR, False discovery rate; GEO, Gene expression omnibus; HER2, Human epidermal growth factor receptor 2; KM, Kaplan Meier; ML, Machine learning; NSCLC, Non small cell lung carcinoma; OS, Overall survival; PCA, Principal component analysis; POU2AF1; PR, Progesterone receptor; Prognostic gene signatures; RF, Random forest; RFE, Recursive feature elimination; ROC, Receiver operating characteristics curve; S100B; SVM, Support vector machine; TNBC; TNBC, Triple negative breast cancer; kNN, k Nearest neighbors
Entry Date(s): Date Created: 20220425 Latest Revision: 20220716
Update Code: 20231215
PubMed Central ID: PMC9014315
DOI: 10.1016/j.csbj.2022.03.019
PMID: 35465161
Database: MEDLINE