Chapter 2 Introduction

This book explains how to make (supervised) machine learning models interpretable. The chapters contain some mathematical formulas, but you should be able to understand the ideas behind the methods even without the formulas. This book is not for people trying to learn machine learning from scratch. If you are new to machine learning, there are many books and other resources for learning the basics. As starting points, I recommend the book “The Elements of Statistical Learning” by Hastie, Tibshirani, and Friedman (2009) [1] and Andrew Ng’s “Machine Learning” online course on the online learning platform coursera.com. Both the book and the course are available free of charge!

New methods for the interpretation of machine learning models are published at breakneck speed. Keeping up with everything that is published would be madness and simply impossible. That is why you will not find the most novel and fancy methods in this book, but established methods and basic concepts of machine learning interpretability. These basics prepare you for making machine learning models interpretable. Internalizing the basic concepts also empowers you to better understand and evaluate any new paper on interpretability published on arxiv.org in the 5 minutes since you began reading this book (I might be exaggerating the publication rate).

This book starts with some (dystopian) short stories that are not needed to understand the book, but hopefully will entertain you and make you think. Then the book explores the concepts of machine learning interpretability. We will discuss when interpretability is important and the different types of explanations that exist. Terms used throughout the book can be looked up in the Terminology chapter. Most of the models and methods explained are presented using real data examples, which are described in the Data chapter. One way to make machine learning interpretable is to use interpretable models, such as linear models or decision trees. The other option is to use model-agnostic interpretation tools that can be applied to any supervised machine learning model. Model-agnostic methods can be divided into global methods that describe the average behavior of the model, and local methods that explain individual predictions. The Model-Agnostic Methods chapter deals with methods such as partial dependence plots and feature importance. Model-agnostic methods work by changing the input of the machine learning model and measuring changes in the prediction output. The book ends with an optimistic outlook on what the future of interpretable machine learning might look like.
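
The idea of changing the input and measuring the change in the output can be sketched in a few lines of Python. The following is only a minimal illustration, not code from this book: the toy data, the plain least-squares model, and the helper name `permutation_importance` are all assumptions made for the example. The helper shuffles one feature column at a time and records how much the mean squared error of the predictions grows.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Hypothetical helper: mean increase in MSE when a feature is shuffled.

    `predict` can be any function that maps a feature matrix to predictions,
    which is what makes the procedure model-agnostic.
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Change the input: break the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            # Measure the change in the output via the prediction error.
            increases.append(np.mean((predict(X_perm) - y) ** 2) - baseline)
        importances[j] = np.mean(increases)
    return importances


# Toy data (an assumption for illustration): the target depends strongly on
# feature 0, weakly on feature 1, and not at all on feature 2.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + data_rng.normal(scale=0.1, size=500)

# Stand-in model: ordinary least squares fitted with NumPy.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict(features):
    return features @ coef


importances = permutation_importance(predict, X, y)
print(importances)
```

In this toy setup the first feature should receive by far the largest score and the third a score close to zero, because the target does not depend on it. The helper only ever calls `predict`, so the linear model could be swapped for any other supervised model without changing the procedure.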

You can either read the book from beginning to end or jump directly to the methods that interest you.

I hope you will enjoy the read!


  1. Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. “The Elements of Statistical Learning”. hastie.su.domains/ElemStatLearn (2009).