Academic Journal

Robust SNP-based prediction of rheumatoid arthritis through machine-learning-optimized polygenic risk score

Bibliographic Details
Title: Robust SNP-based prediction of rheumatoid arthritis through machine-learning-optimized polygenic risk score
Authors: Ashley J. W. Lim, C. Tera Tyniana, Lee Jin Lim, Justina Wei Lynn Tan, Ee Tzun Koh, TTSH Rheumatoid Arthritis Study Group, Samuel S. Chong, Chiea Chuen Khor, Khai Pang Leong, Caroline G. Lee
Source: Journal of Translational Medicine, Vol 21, Iss 1, Pp 1-17 (2023)
Publication Information: BMC, 2023.
Publication Year: 2023
Collection: LCC:Medicine
Subject Terms: Machine-learning, Polygenic risk score, Rheumatoid arthritis, Single nucleotide polymorphisms, Medicine
Description: Abstract
Background: Popular statistics-based genome-wide association studies (GWAS) have provided deep insights into the genetics of complex disorders. However, their clinical applicability for predicting disease/trait outcomes remains unclear, as statistical models are not designed to make predictions. This study employs a statistics-free, machine-learning (ML)-optimized polygenic risk score (PRS) to complement existing GWAS and bring the prediction of disease/trait outcomes closer to clinical application. Rheumatoid arthritis (RA) was selected as a model disease to demonstrate the robustness of ML in disease prediction, as RA is a prevalent chronic inflammatory joint disease with high mortality rates that affects adults in their economic prime. Early identification of at-risk individuals may facilitate measures to mitigate the effects of the disease.
Methods: This study employs a robust ML feature selection algorithm to identify single nucleotide polymorphisms (SNPs) that can predict RA from a set of training data comprising RA patients and population control samples. The selected SNPs were then evaluated for their predictive performance across 3 independent, unseen test datasets. They were subsequently used to generate a PRS, which was also evaluated for its predictive capacity as a sole feature.
Results: Through robust ML feature selection, 9 SNPs were found to be the minimum number of features for excellent predictive performance (AUC > 0.9) in 3 independent, unseen test datasets. The PRS based on these 9 SNPs was significantly associated with, and predictive (AUC > 0.9) of, RA in the 3 unseen datasets. An RA ML-PRS calculator based on these 9 SNPs was developed ( https://xistance.shinyapps.io/prs-ra/ ) to facilitate individualized clinical applicability. The majority of the predictive SNPs are protective, reside in non-coding regions, and are either predicted to be potentially functional SNPs (pfSNPs) or in high linkage disequilibrium (r² > 0.8) with un-interrogated pfSNPs.
Conclusions: These findings highlight the promise of this ML strategy for identifying useful genetic features that can robustly predict disease and are amenable to translation into clinical application.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 1479-5876
Relation: https://doaj.org/toc/1479-5876
DOI: 10.1186/s12967-023-03939-5
Access URL: https://doaj.org/article/87deb690648f4aca92310084051267ce
Accession Number: edsdoj.87deb690648f4aca92310084051267ce
Database: Directory of Open Access Journals