A methodology based on eXplainable Artificial Intelligence for identifying, measuring, and reporting differences of feature effects between subpopulations
Abstract
In the same way that it is a mistake to treat a cold and influenza with the same treatment just because both are respiratory illnesses, it is not optimal to take the same approach with two distinct groups just because they face the same problem or are in a similar situation. The inherent differences between groups influence how the features or variables that describe a scenario affect the outcome. While these effects could simply be attributed to group membership, doing so leaves us unable to act in cases where the category is immutable. In this Thesis I present a novel methodology for sub-population comparison based on Shapley values and feature effect concepts. This methodology is capable of identifying variations in feature effects between groups in terms of importance, magnitude, and direction, and of presenting these results in an explainable manner. The methodology is model- and situation-agnostic, allowing its use across many different situations and cases. The resulting explanations are also given in the context of the features of each case, meaning that they can be used for the development of interventions (both global and individual) or for better understanding of the situation at hand. The methodology was built under the principles and goals of explainable AI and validated by expert opinion in the case-study fields. With the development and validation of this methodology, we provide evidence that it is possible to identify variations in feature effects in importance, magnitude, and direction between groups using Explainable AI, and that these differences can be used to provide recommendations for personalized interventions for individuals or groups. The resulting methodology is useful in cases where explanations of predictive models are not only desirable but essential, as the provided explanations can be used for the development of interventions or counterfactuals where needed, even if the final user is not extensively trained in data science.
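As an illustration only (this is not the thesis implementation): the sketch below shows one way the kind of comparison the abstract describes could be set up, contrasting SHAP-based feature effects between two subpopulations in terms of importance, magnitude, and direction. The model choice, the use of the `shap` library, and the helper `compare_group_effects` are assumptions made for the example.

```python
# Illustrative sketch, not the thesis methodology: compare SHAP-based
# feature effects between two subpopulations.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor


def compare_group_effects(X, y, group_mask, feature_names):
    """Fit a model, compute SHAP values, and contrast, per feature:
    - importance gap: difference in mean |effect| between the groups
    - magnitude gap: difference in mean signed effect
    - direction flip: whether the mean effect changes sign across groups
    `group_mask` is a boolean array selecting one subpopulation."""
    model = RandomForestRegressor(random_state=0).fit(X, y)
    # SHAP values for a regression model: array of shape (n_samples, n_features)
    shap_values = shap.TreeExplainer(model).shap_values(X)

    report = {}
    for j, name in enumerate(feature_names):
        effects_a = shap_values[group_mask, j]
        effects_b = shap_values[~group_mask, j]
        report[name] = {
            "importance_gap": float(np.abs(effects_a).mean() - np.abs(effects_b).mean()),
            "magnitude_gap": float(effects_a.mean() - effects_b.mean()),
            "direction_flip": bool(np.sign(effects_a.mean()) != np.sign(effects_b.mean())),
        }
    return report
```

In this sketch, a feature whose mean effect flips sign between groups, or whose importance gap is large, would be a candidate for group-specific interpretation or intervention, which is the kind of difference the methodology aims to surface and explain.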
Description
https://orcid.org/0000-0002-2460-3442