Chapter 1: Interpretable ML Introduction

This chapter introduces the basic concepts of interpretable machine learning. We focus on supervised learning, explain the different types of explanations, and review the topics of correlation and interaction.

§1.01: Introduction, Motivation, and History



§1.02: Interpretation Goals



§1.03: Dimensions of Interpretability



§1.04: Correlation and Dependencies

Pearson's Correlation Coefficient:

\[\rho(X_1, X_2) = \frac{\sum_{i=1}^{n} (x_1^i - \bar{x}_1)(x_2^i - \bar{x}_2)}{\sqrt{\sum_{i=1}^{n} (x_1^i - \bar{x}_1)^2 \sum_{i=1}^{n} (x_2^i - \bar{x}_2)^2}} \in [-1, 1]\]

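Each term of the numerator, $(x_1^i - \bar{x}_1)(x_2^i - \bar{x}_2)$, is the signed area of the rectangle spanned by observation $i$ and the mean point $(\bar{x}_1, \bar{x}_2)$. If positive areas dominate, the correlation coefficient is positive; if negative areas dominate, it is negative. If the positive and negative areas cancel exactly, then $\rho = 0$, which means the features are uncorrelated.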
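The "signed area" reading of the coefficient can be checked numerically. The following is a minimal sketch, assuming NumPy is available. The function name `pearson_correlation` and the toy data are illustrative choices and not part of the course material.

```python
import numpy as np

def pearson_correlation(x1, x2):
    """Pearson correlation computed directly from the defining formula."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    d1 = x1 - x1.mean()  # deviations from the mean of feature 1
    d2 = x2 - x2.mean()  # deviations from the mean of feature 2
    # Numerator: sum of signed rectangle areas around the mean point.
    numerator = np.sum(d1 * d2)
    # Denominator: square root of the product of the summed squared deviations.
    denominator = np.sqrt(np.sum(d1 ** 2) * np.sum(d2 ** 2))
    return numerator / denominator

# Toy data (hypothetical): a noisy increasing relationship and its mirror image.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2_pos = 2.0 * x1 + rng.normal(scale=0.5, size=200)
x2_neg = -x2_pos

rho_pos = pearson_correlation(x1, x2_pos)  # positive areas dominate
rho_neg = pearson_correlation(x1, x2_neg)  # negative areas dominate

# Symmetric example: positive and negative areas cancel exactly,
# so rho is 0 even though x2 is a deterministic function of x1.
x_sym = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
rho_zero = pearson_correlation(x_sym, x_sym ** 2)

print(rho_pos, rho_neg, rho_zero)
```

The last case shows that $\rho = 0$ only rules out a linear relationship: the second feature is fully determined by the first, yet the two are uncorrelated.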

Coefficient of determination $R^2$:

Dependence:

Mutual Information


§1.05: Interaction


[Figure: two panels. Effect curve of a function with feature interactions; effect curve of a function with no feature interactions.]