2.4 Evaluating Interpretability

There is no real consensus on what interpretability in machine learning is. Also it is not clear how to measure it. But there is some first research on it and the attempt to formulate some approaches for the evaluation, as described in the following section.

Doshi-Velez and Kim (2017) propose three major levels when evaluating interpretability:

Application level evaluation (real task): Put the explanation into the product and let the end user test it. For example, on an application level, radiologists would test fracture detection software (which includes a machine learning component to suggest where fractures might be in an x-ray image) directly in order to evaluate the model. This requires a good experimental setup and an idea of how to assess the quality. A good baseline for this is always how good a human would be at explaining the same decision.

Human level evaluation (simple task) is a simplified application level evaluation. The difference is that these experiments are not conducted with the domain experts, but with lay humans. This makes experiments less expensive (especially when the domain experts are radiologists) and it is easier to find more humans. An example would be to show a user different explanations and the human would choose the best.

Function level evaluation (proxy task) does not require any humans. This works best when the class of models used was already evaluated by someone else in a human level evaluation. For example it might be known that the end users understand decision trees. In this case, a proxy for explanation quality might be the depth of the tree. Shorter trees would get a better explainability rating. It would make sense to add the constraint that the predictive performance of the tree remains good and does not drop too much compared to a larger tree.

The next chapter focuses on function level evaluation of explanations for individual predictions. What are the relevant properties of explanations that we would consider for their evaluation?