BELLA: Black box model Explanations by Local Linear Approximations

Bibliographic Details
Title: BELLA: Black box model Explanations by Local Linear Approximations
Authors: Radulovic, Nedeljko; Bifet, Albert; Suchanek, Fabian
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Description: In recent years, understanding the decision-making process of black-box models has become not only a legal requirement but also an additional way to assess their performance. However, the state-of-the-art post-hoc interpretation approaches rely on synthetic data generation. This introduces uncertainty and can hurt the reliability of the interpretations. Furthermore, they tend to produce explanations that apply to only very few data points. This makes the explanations brittle and limited in scope. Finally, they provide scores that have no direct verifiable meaning. In this paper, we present BELLA, a deterministic model-agnostic post-hoc approach for explaining the individual predictions of regression black-box models. BELLA provides explanations in the form of a linear model trained in the feature space. Thus, its coefficients can be used directly to compute the predicted value from the feature values. Furthermore, BELLA maximizes the size of the neighborhood to which the linear model applies, so that the explanations are accurate, simple, general, and robust. BELLA can produce both factual and counterfactual explanations. Our user study confirms the importance of the desiderata we optimize, and our experiments show that BELLA outperforms the state-of-the-art approaches on these desiderata.
Comment: 21 pages, 3 figures, submitted to Journal of Artificial Intelligence
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2305.11311
Accession Number: edsarx.2305.11311
Database: arXiv
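
The description above says that an explanation is a linear model fitted in the feature space, so that its coefficients directly reproduce the predicted value. The Python sketch below illustrates that general idea only; it is not the authors' algorithm. It fits ordinary least squares on the real data points nearest to the query (no synthetic samples, and deterministic for a fixed input). The neighborhood size `k` is a fixed, hypothetical parameter here, whereas BELLA is described as searching for the largest neighborhood to which the linear model applies. All function names and the toy `black_box` are assumptions made for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate neighborhood: system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_local_linear(points, predictions, query, k):
    """Least-squares fit on the k real data points closest to `query`.

    Returns (intercept, coefficients). `k` is an illustrative fixed size,
    not the neighborhood search described in the paper.
    """
    # sorted() is stable, so ties are broken by input order (deterministic).
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    nearest = order[:k]
    rows = [[1.0] + list(points[i]) for i in nearest]  # leading 1 = intercept
    targets = [predictions[i] for i in nearest]
    d = len(rows[0])
    # Normal equations: (A^T A) beta = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(d)]
    beta = solve_linear_system(ata, aty)
    return beta[0], beta[1:]


def apply_explanation(intercept, coefs, x):
    """Compute the surrogate's prediction directly from the feature values."""
    return intercept + sum(c * v for c, v in zip(coefs, x))


# Hypothetical black box; any regression model's predict function would do.
def black_box(x):
    return x[0] ** 2 + 3.0 * x[1]


data = [(float(i), float(j)) for i in range(5) for j in range(5)]
preds = [black_box(p) for p in data]
query = (2.0, 2.0)

intercept, coefs = fit_local_linear(data, preds, query, k=9)
print("intercept:", intercept, "coefficients:", coefs)
print("surrogate:", apply_explanation(intercept, coefs, query),
      "black box:", black_box(query))
```

Because the surrogate is fitted only on existing data points and black-box outputs, rerunning it on the same inputs yields the same explanation. The coefficients can be read as the local change in the prediction per unit change in each feature.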