Chapter 6: Local Interpretable Model-agnostic Explanations (LIME)

A common approach to interpreting an ML model locally is implemented by LIME. The basic idea is to fit a surrogate model that focuses on data points near the observation of interest. The resulting surrogate should belong to an inherently interpretable model class.

§6.01: Introduction to Local Explanations


TO-DO


§6.02: Local Interpretable Model-agnostic Explanations (LIME)


LIME (Local Interpretable Model-agnostic Explanations) is a popular technique for explaining the predictions of any black-box machine learning model. Its primary objective is to provide local, human-understandable explanations for individual predictions, regardless of the underlying model type or complexity. LIME is called “model-agnostic” because it treats the original model as a black box, requiring no access to its internal structure, parameters, or training process. This universality makes LIME applicable to any machine learning model, from simple linear regression to complex deep neural networks.
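
A brief usage sketch is shown below, assuming the widely used open-source `lime` Python package together with scikit-learn; the exact constructor arguments may differ slightly between package versions, and the random forest is just a stand-in for any black-box model.

```python
# Minimal usage sketch of LIME for tabular data, assuming the `lime` package
# (https://github.com/marcotcr/lime) and scikit-learn are installed.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Fit an arbitrary black-box model.
data = load_iris()
black_box = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# Build an explainer from the training data; LIME only needs the data
# distribution and a prediction function, not the model internals.
explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    mode="classification",
)

# Explain a single prediction: each (feature, weight) pair is a coefficient
# of the local surrogate model fitted around that instance.
explanation = explainer.explain_instance(
    data.data[0], black_box.predict_proba, num_features=4
)
print(explanation.as_list())
```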

LIME Algorithm

For a pre-trained black-box model \(\hat{f}\), an observation \(x\) whose prediction we wish to explain, and an interpretable model class \(G\), LIME proceeds as follows:

  1. Independently sample new points \(z \in Z\) by slightly modifying the features of the selected instance. For tabular data, this might mean sampling from the feature’s distribution; for images, it could mean turning superpixels on or off; for text, removing or replacing words.
  2. Retrieve the prediction \(\hat{f}(z)\) for each sampled point \(z\) from the black-box model.
  3. Weight each \(z \in Z\) by a proximity measure \(\phi_x(z)\) that quantifies its closeness to \(x\).
  4. Train an interpretable surrogate model \(\hat{g}\) on the points \(z \in Z\), using the weights \(\phi_x(z)\) and the predictions \(\hat{f}(z)\) as targets.
  5. Return \(\hat{g}\) as the local explanation for \(\hat{f}\); a code sketch of these steps follows the list.
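
Below is a minimal from-scratch sketch of the five steps for tabular data, using NumPy and scikit-learn. The Gaussian perturbation scheme, the exponential proximity kernel, the kernel width, and the ridge surrogate are illustrative choices, not the defaults of any particular LIME implementation.

```python
# From-scratch sketch of the LIME steps above for tabular data.
import numpy as np
from sklearn.linear_model import Ridge

def lime_tabular(f_hat, x, n_samples=1000, sigma=0.75, scale=0.1, seed=0):
    rng = np.random.default_rng(seed)

    # Step 1: sample points z around x by perturbing its features.
    Z = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))

    # Step 2: query the black-box model for predictions at the sampled points.
    y_hat = f_hat(Z)

    # Step 3: weight each z by its proximity to x (exponential kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / (sigma ** 2))

    # Step 4: fit an interpretable (here: linear) surrogate with those weights.
    g_hat = Ridge(alpha=1.0)
    g_hat.fit(Z, y_hat, sample_weight=weights)

    # Step 5: return the surrogate; its coefficients are the local explanation.
    return g_hat

# Example usage with a toy regression black box (illustrative only):
f_hat = lambda Z: np.sin(Z[:, 0]) + Z[:, 1] ** 2
x = np.array([0.5, -1.0])
surrogate = lime_tabular(f_hat, x)
print(surrogate.coef_)  # local feature effects around x
```

Using a weighted linear model as \(\hat{g}\) means the returned coefficients can be read directly as local feature effects around \(x\).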

§6.03: LIME Examples


TO-DO


§6.04: LIME Pitfalls


While LIME is a powerful and intuitive tool for explaining the predictions of complex “black-box” models, its methodology gives rise to several significant challenges. Understanding these limitations is crucial for using it responsibly.

1. Sampling: Ignoring Feature Dependencies and Extrapolation Risks

2. Locality Definition: Sensitivity to Kernel Width and Distance Metrics

3. Local vs. Global Features: Overshadowing of Local Signals

4. Poor Fidelity for Complex Models

5. Hiding Biases

6. Robustness: Unstable Explanations for Similar Points

7. Superpixels (for Images): Instability from Segmentation
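
To make the second pitfall concrete, the following self-contained sketch refits the local surrogate for the same point under several kernel widths; the black box, the perturbation scale, and the widths are arbitrary illustrative choices. The resulting coefficients can differ substantially, showing how strongly the explanation depends on how “locality” is defined.

```python
# Illustration of kernel-width sensitivity (pitfall 2): same point, same
# black box, different kernel widths -> different surrogate coefficients.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
f_hat = lambda Z: np.sin(3 * Z[:, 0]) + 0.5 * Z[:, 1]  # toy black box
x = np.array([1.0, 0.0])                               # point to explain

Z = x + rng.normal(scale=0.5, size=(2000, 2))          # perturbed samples
y_hat = f_hat(Z)
dist = np.linalg.norm(Z - x, axis=1)

for sigma in (0.1, 0.5, 2.0):                          # candidate kernel widths
    weights = np.exp(-(dist ** 2) / (sigma ** 2))
    coefs = Ridge(alpha=1.0).fit(Z, y_hat, sample_weight=weights).coef_
    print(f"kernel width {sigma}: coefficients {np.round(coefs, 3)}")
```

With a narrow kernel the surrogate tracks the steep local slope of the sine term; with a wide kernel it averages over a larger region and the explanation drifts toward a more global effect.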