Increasing Transparency in Machine Learning through Bootstrap Simulation and Shapely Additive Explanations

Bibliographic details
Title: Increasing Transparency in Machine Learning through Bootstrap Simulation and Shapely Additive Explanations
Authors: Alexander Huang, Samuel Huang
Publication information: Research Square Platform LLC, 2022.
Publication year: 2022
Description: Importance: Machine learning methods are widely used in the medical field, but the reliability and efficacy of these models are difficult to assess. We assessed whether variance calculations of model metrics (e.g., AUROC, sensitivity, specificity) through bootstrap simulation, together with SHapley Additive exPlanations (SHAP), could increase model transparency. Methods: Data from the England National Health Services Heart Disease Prediction Cohort were used. XGBoost was the machine-learning model of choice in this study. Bootstrap simulation (N = 10,000) was used to empirically derive the distribution of model metrics and covariate Gain statistics. SHAP was used to provide explanations of machine-learning output, and simulation was used to evaluate the variance of model accuracy metrics. Results: Among the 10,000 simulations completed, the AUROC ranged from 0.771 to 0.947, a difference of 0.176; the balanced accuracy ranged from 0.688 to 0.894, a difference of 0.205; the sensitivity ranged from 0.632 to 0.939, a difference of 0.307; and the specificity ranged from 0.595 to 0.944, a difference of 0.394. Across the same simulations, the Gain for Angina ranged from 0.225 to 0.456, a difference of 0.231; for Cholesterol from 0.148 to 0.326, a difference of 0.178; for MaxHR from 0.081 to 0.200, a difference of 0.119; and for Age from 0.059 to 0.157, a difference of 0.098. Conclusion: Using simulations to empirically evaluate the variance of model metrics, and explanatory algorithms to check whether covariates match the literature, is necessary for increased transparency, reliability, and utility of machine learning methods.
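The bootstrap step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a set of held-out labels and predicted scores is already available, resamples those pairs with replacement, and reports the minimum, maximum, and spread of each metric. The data are synthetic, the function names (`make_synthetic_scores`, `bootstrap_metric_ranges`) are hypothetical, the 0.5 decision threshold is an assumption, and no XGBoost model is trained here.

```python
import random


def auroc(labels, scores):
    """AUROC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity at an assumed decision threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def make_synthetic_scores(n=100, seed=0):
    """Hypothetical stand-in for model output: alternating labels, noisy scores in [0, 1]."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    scores = [min(1.0, max(0.0, rng.gauss(0.65 if y else 0.35, 0.2))) for y in labels]
    return labels, scores


def bootstrap_metric_ranges(labels, scores, n_boot=500, seed=0):
    """Resample (label, score) pairs with replacement; return (min, max, spread) per metric."""
    rng = random.Random(seed)
    n = len(labels)
    results = {"auroc": [], "sensitivity": [], "specificity": [], "balanced_accuracy": []}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        s = [scores[i] for i in idx]
        if len(set(y)) < 2:
            continue  # skip degenerate resamples that contain a single class
        sens, spec = sensitivity_specificity(y, s)
        results["auroc"].append(auroc(y, s))
        results["sensitivity"].append(sens)
        results["specificity"].append(spec)
        results["balanced_accuracy"].append((sens + spec) / 2)
    return {name: (min(v), max(v), max(v) - min(v)) for name, v in results.items()}


labels, scores = make_synthetic_scores()
for name, (lo, hi, spread) in bootstrap_metric_ranges(labels, scores).items():
    print(f"{name}: {lo:.3f} to {hi:.3f} (difference {spread:.3f})")
```

The printed ranges have the same form as those reported in the abstract (minimum, maximum, and their difference), but the values come from synthetic data and are not comparable to the study's results. Retraining the model on each resample, which a Gain distribution would require, is outside the scope of this sketch.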
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::935dffbc053d00b39a023fb3773ddb48
https://doi.org/10.21203/rs.3.rs-2075948/v2
Rights: OPEN
Accession number: edsair.doi.dedup.....935dffbc053d00b39a023fb3773ddb48
Database: OpenAIRE