01 Chapter 1: Interpretable ML Introduction This chapter introduces the basic concepts of Interpretable Machine Learning. We focus on supervised learning, explain the different types of explanations, and review the topics of correlation and interaction. 5 sections
02 Chapter 2: Interpretable Models Some machine learning models are inherently interpretable, e.g. simple LMs, GLMs, GAMs, and rule-based models. These models are briefly summarized and their interpretation is clarified. 6 sections
03 Chapter 3: Feature Effects Feature effects indicate how the prediction changes as feature values change. This chapter explains the feature-effect methods ICE curves, PDPs, and ALE plots. 5 sections
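The mechanics behind PDPs and ICE curves can be sketched in a few lines: force the feature of interest to each grid value for every observation, predict, and average. The sketch below is purely illustrative; the helper name `partial_dependence`, the toy model `f`, and the data are assumptions, not part of any particular library.

```python
import numpy as np

def partial_dependence(f, X, feature, grid):
    """PDP estimate: average prediction over the data while the chosen
    feature is forced to each grid value in turn. The per-row predictions
    before averaging would be the ICE curves."""
    pdp = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v        # overwrite the feature everywhere
        pdp.append(f(X_mod).mean())  # average over the data distribution
    return np.array(pdp)

# Toy model: prediction depends linearly on feature 0 only,
# so the PDP for feature 0 should be the line 2 * v.
f = lambda X: 2.0 * X[:, 0]
X = np.random.default_rng(0).normal(size=(100, 3))
grid = np.array([-1.0, 0.0, 1.0])
pdp = partial_dependence(f, X, feature=0, grid=grid)
```

For this linear toy model the PDP recovers the line exactly; for a real model the averaging marginalizes over the other features, which is where the extrapolation issues addressed by ALE plots come from.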
04 Chapter 4: Functional Decomposition This chapter focuses on understanding how ML models make predictions by breaking down their behavior into simpler, interpretable components. This is achieved through the concept of Functional Decomposition, with specific methods like Classical Functional ANOVA (fANOVA) and Friedman's H-Statistic. 4 sections
05 Chapter 5: Shapley Shapley values originate from classical game theory and aim to fairly divide a payout among players. This section gives a brief explanation of Shapley values in game theory, followed by an adaptation to IML resulting in the method SHAP. 3 sections
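The game-theoretic definition can be made concrete with a tiny cooperative game. The following sketch computes exact Shapley values by enumerating all coalitions; the function names and the glove-game example are illustrative, and exact enumeration is only feasible for small player sets (SHAP approximates this for features).

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values: each player's weighted average marginal
    contribution v(S + p) - v(S) over all coalitions S not containing p."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                # Standard Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (v(S | {p}) - v(S))
        phi[p] = total
    return phi

# Toy "glove game": player 1 owns a left glove, players 2 and 3 right
# gloves; only a matched pair earns a payout of 1.
def v(S):
    return 1.0 if 1 in S and (2 in S or 3 in S) else 0.0

phi = shapley_values([1, 2, 3], v)
```

The result illustrates the efficiency axiom: the values sum to the payout of the full coalition, with the scarce left glove receiving the largest share.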
06 Chapter 6: Local Interpretable Model-agnostic Explanations (LIME) A common approach to interpreting an ML model locally is implemented by LIME. The basic idea is to fit a surrogate model while focusing on data points near the observation of interest. The resulting model should be inherently interpretable. 4 sections
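The local-surrogate idea can be sketched as a proximity-weighted linear regression around the point of interest. This is a simplified sketch of the principle, not the full LIME algorithm (which, e.g., works on an interpretable binary representation); the sampling scheme, kernel width, and names are assumptions.

```python
import numpy as np

def local_linear_surrogate(f, x, n_samples=500, width=0.5, seed=0):
    """Fit a linear surrogate to a black-box model f around x:
    sample perturbations, weight them by proximity to x, and solve
    the weighted least-squares problem for local coefficients."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(size=(n_samples, x.size))  # perturbed samples
    y = f(Z)                                      # black-box predictions
    d2 = ((Z - x) ** 2).sum(axis=1)
    w = np.exp(-d2 / width**2)                    # proximity kernel weights
    A = np.hstack([np.ones((n_samples, 1)), Z])   # intercept + features
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef  # [intercept, slope_1, slope_2, ...]

# Sanity check: for a black box that is exactly linear, the local
# surrogate should recover the true coefficients.
f = lambda Z: 3.0 * Z[:, 0] - 1.0 * Z[:, 1]
coef = local_linear_surrogate(f, np.array([0.5, -0.2]))
```

For a nonlinear black box the recovered slopes instead approximate the local gradient around `x`, which is exactly the kind of explanation LIME aims for.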
07 Chapter 7: Counterfactual Explanations This chapter deals with further local analyses. First, counterfactuals are examined, which search for data points in the neighborhood of an observation that lead to a different prediction. 2 sections
08 Chapter 8: Feature Importance Methods in this category aim to rank the features according to their influence on the predictive performance of an ML model. Depending on the interpretation goal, some of these methods are more suitable than others. 2 sections
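A standard method in this category is permutation feature importance: shuffle one feature column, remeasure the loss, and rank features by how much performance drops. The helper names and toy setup below are illustrative.

```python
import numpy as np

def permutation_importance(f, X, y, loss, n_repeats=5, seed=0):
    """Mean increase in loss when one feature column is shuffled;
    shuffling breaks the feature-target link, so important features
    show a large increase."""
    rng = np.random.default_rng(seed)
    base = loss(y, f(X))
    imp = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # permute feature j in place
            imp[j] += loss(y, f(Xp)) - base
        imp[j] /= n_repeats
    return imp

mse = lambda y, yhat: ((y - yhat) ** 2).mean()
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 4.0 * X[:, 0]            # only feature 0 matters
f = lambda X: 4.0 * X[:, 0]  # a "model" that reproduces it exactly
imp = permutation_importance(f, X, y, mse)
```

Because the toy model ignores features 1 and 2, their importance is exactly zero, while shuffling feature 0 destroys the predictions.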
01 Chapter 11: Advanced Risk Minimization This chapter revisits the theory of risk minimization, providing more in-depth analysis on established losses and the connection between empirical risk minimization and maximum likelihood estimation. 13 sections
02 Chapter 12: Multiclass Classification This chapter treats the multiclass case of classification. Tasks with more than two classes preclude the application of some techniques studied in the binary scenario and require an adaptation of loss functions. 3 sections
03 Chapter 13: Information Theory This chapter covers basic information-theoretic concepts and discusses their relation to machine learning. 8 sections
04 Chapter 15: Regularization Regularization is a vital tool in machine learning to prevent overfitting and foster generalization. This chapter introduces the concept of regularization and discusses common regularization techniques in more depth. 12 sections
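The shrinkage effect of one common technique, L2 (ridge) regularization, can be made concrete with its closed-form solution theta = (X^T X + lambda I)^{-1} X^T y. The toy data and names below are illustrative.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: the penalty lam * ||theta||^2
    adds lam to the diagonal of X^T X and shrinks coefficients."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.0]) + 0.1 * rng.normal(size=100)

theta_small = ridge_fit(X, y, lam=0.01)    # near-OLS solution
theta_large = ridge_fit(X, y, lam=1000.0)  # heavily shrunk solution
```

Increasing lambda monotonically shrinks the coefficient norm toward zero, trading a little bias for lower variance, which is the mechanism behind its regularizing effect.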
05 Chapter 16: Linear Support Vector Machine This chapter introduces the linear support vector machine (SVM), a linear classifier that finds decision boundaries by maximizing margins to the closest data points, possibly allowing for violations to a certain extent. 5 sections
06 Chapter 17: Nonlinear Support Vector Machines Many classification problems warrant nonlinear decision boundaries. This chapter introduces nonlinear support vector machines as a crucial extension to the linear variant. 6 sections
07 Chapter 18: Boosting This chapter introduces boosting as a sequential ensemble method that creates powerful committees from different kinds of base learners. 12 sections
01 Chapter 1: Introduction & Multi-Armed Bandits This chapter introduces the fundamental concepts of Reinforcement Learning, including its key characteristics of trial-and-error search and delayed rewards. It also introduces multi-armed bandits, the exploration-exploitation tradeoff, and various methods for action-value estimation. 9 sections
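One of the action-value methods covered here, epsilon-greedy with incremental sample-average estimates, can be sketched on a Gaussian bandit. The arm means, step count, and epsilon below are illustrative choices.

```python
import random

def run_bandit(true_means, steps=20000, eps=0.1, seed=0):
    """Epsilon-greedy on a Gaussian multi-armed bandit with
    incremental sample-average action-value estimates Q."""
    rng = random.Random(seed)
    k = len(true_means)
    Q = [0.0] * k  # action-value estimates
    N = [0] * k    # pull counts
    for _ in range(steps):
        if rng.random() < eps:                     # explore: random arm
            a = rng.randrange(k)
        else:                                      # exploit: greedy arm
            a = max(range(k), key=lambda i: Q[i])
        r = rng.gauss(true_means[a], 1.0)          # noisy reward
        N[a] += 1
        Q[a] += (r - Q[a]) / N[a]                  # incremental mean update
    return Q, N

Q, N = run_bandit([0.2, 0.5, 1.0])
```

After enough steps the estimate for the best arm converges to its true mean and that arm dominates the pull counts, illustrating how a small epsilon keeps exploration alive while mostly exploiting.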
02 Chapter 2: Finite Markov Decision Processes This chapter explores the fundamental concepts of Markov Decision Processes (MDPs), covering the agent-environment interface, goals and rewards, returns and episodes, and policies and value functions. 6 sections
03 Chapter 4: Temporal-Difference Learning This chapter covers Temporal-Difference (TD) learning methods that combine ideas from Monte Carlo and dynamic programming, enabling agents to learn directly from raw experience without a model of the environment. 4 sections
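The core TD(0) update, V(s) <- V(s) + alpha * [r + gamma * V(s') - V(s)], can be sketched on the classic random-walk chain (a standard evaluation example; the state layout and hyperparameters below are illustrative).

```python
import random

def td0_random_walk(n_states=5, episodes=5000, alpha=0.1, gamma=1.0, seed=0):
    """TD(0) policy evaluation on a random-walk chain: start in the
    middle, step left or right uniformly; reward 1 for exiting right,
    0 for exiting left. True values are (s + 1) / (n_states + 1)."""
    rng = random.Random(seed)
    V = [0.0] * n_states
    for _ in range(episodes):
        s = n_states // 2
        while True:
            s2 = s + (1 if rng.random() < 0.5 else -1)
            if s2 < 0:                   # exited left: reward 0, terminal
                target, done = 0.0, True
            elif s2 >= n_states:         # exited right: reward 1, terminal
                target, done = 1.0, True
            else:                        # bootstrap from the next state
                target, done = gamma * V[s2], False
            V[s] += alpha * (target - V[s])  # the TD(0) update
            if done:
                break
            s = s2
    return V

V = td0_random_walk()
```

Unlike Monte Carlo, each state is updated immediately from the bootstrapped target rather than waiting for the episode's final return, which is exactly the dynamic-programming idea TD borrows.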
04 Chapter 5: n-step Bootstrapping This section explains n-step bootstrapping techniques, which generalize TD learning by updating value estimates using returns accumulated over multiple steps, balancing bias and variance in learning. 4 sections
05 Chapter 6: Function Approximation, Deep Q-Networks, Expected SARSA This chapter discusses function approximation methods for scaling RL to large state spaces, including Deep Q-Networks (DQN) for learning value functions with neural networks and the Expected SARSA algorithm for stable policy evaluation. 5 sections
06 Chapter 7: Policy Gradient Algorithms, REINFORCE, Actor-Critic Algorithms, DPG, Hierarchical RL This section introduces policy gradient methods for directly optimizing policies, detailing the REINFORCE algorithm, Actor-Critic frameworks, Deterministic Policy Gradient (DPG), and approaches to hierarchical reinforcement learning for complex task decomposition.
01 Large Language Models Transformers, Attention, Positional Encoding, BERT, BART, GPT, Pre-Training & Finetuning, Decoding Strategies, Tokenization, Data, Fast Attention Mechanisms, LoRA, Fast Inference Mechanisms. PDF notes