Academic Journal

Molecular Representations in Machine-Learning-Based Prediction of PK Parameters for Insulin Analogs.

Bibliographic Details
Title: Molecular Representations in Machine-Learning-Based Prediction of PK Parameters for Insulin Analogs.
Authors: Einarson KA; Danish Technical University (DTU), Applied Mathematics and Computer Science, Kongens Lyngby 2800, Denmark.; Novo Nordisk A/S, Global Drug Discovery, Research & Early Development (R&ED), Måløv 2760, Denmark., Bendtsen KM; Novo Nordisk A/S, Digital Science & Innovation, R&ED, Måløv 2760, Denmark., Li K; Novo Nordisk A/S, Digital Science & Innovation, R&ED, Måløv 2760, Denmark., Thomsen M; Novo Nordisk A/S, Digital Science & Innovation, R&ED, Måløv 2760, Denmark., Kristensen NR; Novo Nordisk A/S, Data Science, Development, Søborg 2860, Denmark., Winther O; Danish Technical University (DTU), Applied Mathematics and Computer Science, Kongens Lyngby 2800, Denmark.; Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen 2100, Denmark.; Department of Biology, Bioinformatics Centre, University of Copenhagen, Copenhagen 2200, Denmark., Fulle S; Novo Nordisk A/S, Digital Science & Innovation, R&ED, Måløv 2760, Denmark., Clemmensen L; Danish Technical University (DTU), Applied Mathematics and Computer Science, Kongens Lyngby 2800, Denmark., Refsgaard HHF; Novo Nordisk A/S, Global Drug Discovery, Research & Early Development (R&ED), Måløv 2760, Denmark.
Source: ACS omega [ACS Omega] 2023 Jun 22; Vol. 8 (26), pp. 23566-23578. Date of Electronic Publication: 2023 Jun 22 (Print Publication: 2023).
Publication Type: Journal Article
Language: English
Journal Info: Publisher: American Chemical Society Country of Publication: United States NLM ID: 101691658 Publication Model: eCollection Cited Medium: Internet ISSN: 2470-1343 (Electronic) Linking ISSN: 24701343 NLM ISO Abbreviation: ACS Omega Subsets: PubMed not MEDLINE
Imprint Name(s): Original Publication: Washington, D.C. : American Chemical Society, [2016]-
Abstract: Therapeutic peptides and proteins derived from either endogenous hormones, such as insulin, or de novo design via display technologies occupy a distinct pharmaceutical space between small molecules and large proteins such as antibodies. Optimizing the pharmacokinetic (PK) profile of drug candidates is of high importance when prioritizing lead candidates, and machine-learning models can provide a relevant tool to accelerate the drug design process. Predicting PK parameters of proteins remains difficult due to the complex factors that influence PK properties; furthermore, the data sets are small compared to the variety of compounds in the protein space. This study describes a novel combination of molecular descriptors for proteins such as insulin analogs, many of which contained chemical modifications, e.g., attached small molecules for protraction of the half-life. The underlying data set consisted of 640 structurally diverse insulin analogs, of which around half had attached small molecules. Other analogs were conjugated to peptides, amino acid extensions, or fragment crystallizable regions. The PK parameters clearance (CL), half-life (T1/2), and mean residence time (MRT) could be predicted by using classical machine-learning models such as Random Forest (RF) and Artificial Neural Networks (ANN), with root-mean-square errors for CL of 0.60 and 0.68 (log units) and average fold errors of 2.5 and 2.9 for RF and ANN, respectively. Both random and temporal data splits were employed to evaluate ideal and prospective model performance, with the best models, regardless of data split, achieving a minimum of 70% of predictions within a twofold error.
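The three error measures named above (root-mean-square error in log units, average fold error, and the share of predictions within a twofold error) can be illustrated with a short sketch. The abstract does not give the formulas, so the definitions below are assumptions: base-10 logarithms, and the absolute form of the average fold error, 10 raised to the mean of |log10(predicted/observed)|. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def rmse_log10(observed, predicted):
    """RMSE between log10-transformed observed and predicted values (assumed base 10)."""
    sq = [(math.log10(p) - math.log10(o)) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(sq) / len(sq))

def average_fold_error(observed, predicted):
    """Assumed absolute form: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))

def fraction_within_fold(observed, predicted, fold=2.0):
    """Share of predictions whose fold error max(p/o, o/p) is at most `fold`."""
    hits = sum(1 for o, p in zip(observed, predicted) if max(p / o, o / p) <= fold)
    return hits / len(observed)

# Made-up clearance values for illustration only (not data from the study).
observed = [1.0, 2.0, 4.0, 8.0]
predicted = [1.5, 2.0, 10.0, 4.0]

print(rmse_log10(observed, predicted))
print(average_fold_error(observed, predicted))
print(fraction_within_fold(observed, predicted))
```

Under these assumed definitions, a fold error of 2 corresponds to a difference of about 0.30 log10 units, which is one way to relate the RMSE and fold-error figures to each other.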
The tested molecular representations include (1) global physicochemical descriptors combined with descriptors encoding the amino acid composition of the insulin analogs, (2) physicochemical descriptors of the attached small molecule, (3) protein language model (evolutionary scale modeling) embedding of the amino acid sequence of the molecules, and (4) a natural language processing inspired embedding (mol2vec) of the attached small molecule. Encoding the attached small molecule via (2) or (4) significantly improved the predictions, while the benefit of using the protein language model-based encoding (3) depended on the machine-learning model used. Using Shapley additive explanations (SHAP) values, the most important molecular descriptors were identified as those related to the molecular size of both the protein and the protraction part. Overall, the results show that combining representations of proteins and small molecules was key for PK predictions of insulin analogs.
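One simple way to combine a protein-level representation with a small-molecule representation is to concatenate the two feature blocks into a single row per analog. The sketch below illustrates only that idea. The abstract does not say how the blocks were joined or how analogs without an attached small molecule were encoded, so the zero-vector fill is an assumption, and `combine_blocks` and the toy vectors are hypothetical.

```python
from typing import List, Optional, Sequence

def combine_blocks(protein_block: Sequence[float],
                   small_molecule_block: Optional[Sequence[float]],
                   small_molecule_dim: int) -> List[float]:
    """Concatenate a protein-level block with a small-molecule block.

    Assumption: an analog without an attached small molecule gets a zero
    vector, so that every row has the same length.
    """
    if small_molecule_block is None:
        small_molecule_block = [0.0] * small_molecule_dim
    if len(small_molecule_block) != small_molecule_dim:
        raise ValueError("small-molecule block has unexpected length")
    return list(protein_block) + list(small_molecule_block)

# Toy numbers only; real descriptor or embedding vectors are much longer.
row_with = combine_blocks([0.2, 1.1, 5.8], [0.7, 0.3], small_molecule_dim=2)
row_without = combine_blocks([0.4, 0.9, 5.1], None, small_molecule_dim=2)

print(row_with)
print(row_without)
```

Rows built this way have a fixed length and could be passed to any tabular learner, such as a random forest or a feed-forward network.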
Competing Interests: The authors declare the following competing financial interest(s): KAE, KMB, KI, MT, NRK, SF & HHFR are all employees and minor stockholders at Novo Nordisk A/S. Only a very small number of therapeutic proteins with small-molecule attachments has publicly available in vivo PK data. The manuscript is therefore based on Novo Nordisk A/S proprietary data.
(© 2023 The Authors. Published by American Chemical Society.)
Entry Dates: Date Created: 20230710 Latest Revision: 20230718
Update Code: 20230718
PubMed Central ID: PMC10324072
DOI: 10.1021/acsomega.3c01218
PMID: 37426277
Database: MEDLINE
Description
ISSN: 2470-1343