Chapter 16: Linear Support Vector Machine

This chapter introduces the linear support vector machine (SVM), a linear classifier that finds its decision boundary by maximizing the margin to the closest data points, possibly while allowing a limited amount of margin violation (the soft-margin case).

§16.01: Linear Hard Margin SVM (Primal)


Among the different possible linear classifiers that separate the data, the hard-margin SVM picks the one that maximizes the margin width:
\[\max_{\theta, \theta_0} \text{width}(\theta) = \max_{\theta, \theta_0} \frac{2}{||\theta||}\\ \text{ s.t. } y^{(i)}(\theta^T x^{(i)} + \theta_0) \geq 1\]

Since maximizing $2/||\theta||$ is equivalent to minimizing $||\theta||$, this is usually rewritten as the convex quadratic program
\[\min_{\theta, \theta_0} \frac{1}{2}||\theta||^2 \\ \text{ s.t. } y^{(i)}(\theta^T x^{(i)} + \theta_0) \geq 1 \; \forall i\]
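The primal is a small quadratic program, so on toy data it can be solved directly with a general-purpose constrained optimizer. Below is a minimal illustrative sketch using SciPy's SLSQP solver (the toy one-dimensional dataset and variable packing are my own choices, not from the chapter):

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1-D separable data: x = -1 labeled -1, x = +1 labeled +1.
X = np.array([[-1.0], [1.0]])
y = np.array([-1.0, 1.0])

# Decision variables packed as v = [theta..., theta_0].
def objective(v):
    theta = v[:-1]
    return 0.5 * theta @ theta  # (1/2)||theta||^2

# One inequality constraint per point: y_i (theta^T x_i + theta_0) - 1 >= 0.
constraints = [
    {"type": "ineq", "fun": (lambda v, i=i: y[i] * (X[i] @ v[:-1] + v[-1]) - 1.0)}
    for i in range(len(y))
]

res = minimize(objective, x0=np.zeros(X.shape[1] + 1), constraints=constraints)
theta, theta_0 = res.x[:-1], res.x[-1]
width = 2.0 / np.linalg.norm(theta)
print(theta, theta_0, width)  # expect theta ≈ [1.], theta_0 ≈ 0, width ≈ 2
```

On this symmetric dataset the optimal boundary is $x = 0$ with both points on the supporting hyperplanes, so the recovered width matches $2/||\theta^\ast|| = 2$.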

§16.02: Linear Hard Margin SVM (Dual)
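Introducing a multiplier $\alpha_i \geq 0$ for each margin constraint and eliminating $\theta, \theta_0$ from the Lagrangian yields the standard hard-margin dual (stated here for reference in the notation of the primal above; the step-by-step derivation is the usual Lagrangian duality argument):

\[\max_{\alpha \geq 0} \sum_{i} \alpha_i - \frac{1}{2}\sum_{i}\sum_{j} \alpha_i \alpha_j y^{(i)} y^{(j)} x^{(i)T} x^{(j)} \\ \text{ s.t. } \sum_{i} \alpha_i y^{(i)} = 0\]

The primal solution is recovered as $\theta^\ast = \sum_i \alpha_i^\ast y^{(i)} x^{(i)}$; the points with $\alpha_i^\ast > 0$ are the support vectors.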



§16.03: Linear Soft Margin SVM
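The soft-margin primal relaxes the hard-margin constraints with slack variables $\epsilon^{(i)} \geq 0$ (the “bribes”), penalized at rate $C > 0$; it is stated here since the slack notation is used throughout this section:

\[\min_{\theta, \theta_0, \epsilon} \frac{1}{2}||\theta||^2 + C\sum_{i} \epsilon^{(i)} \\ \text{ s.t. } y^{(i)}(\theta^T x^{(i)} + \theta_0) \geq 1 - \epsilon^{(i)}, \quad \epsilon^{(i)} \geq 0\]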


Dual formulation of Linear soft margin SVM
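With multipliers $\alpha_i \geq 0$ for the margin constraints and $\mu_i \geq 0$ for the constraints $\epsilon^{(i)} \geq 0$, the soft-margin dual matches the hard-margin dual except that each $\alpha_i$ is now boxed by $C$ (a standard result, stated here for reference):

\[\max_{\alpha} \sum_{i} \alpha_i - \frac{1}{2}\sum_{i}\sum_{j} \alpha_i \alpha_j y^{(i)} y^{(j)} x^{(i)T} x^{(j)} \\ \text{ s.t. } \sum_{i} \alpha_i y^{(i)} = 0, \quad 0 \leq \alpha_i \leq C\]

Stationarity of the Lagrangian in $\epsilon^{(i)}$ gives $\mu_i = C - \alpha_i$, which is used repeatedly in the case analysis below.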

Complementary Slackness Conditions

Let $(\theta^\ast, \theta_0^\ast, \epsilon^\ast)$ be the primal solutions and $(\alpha^\ast, \mu^\ast)$ be the dual solutions. Then at optimality, the following complementary slackness (CS) conditions hold:

  1. $\alpha_i^\ast(1-y^{(i)}(\theta^{\ast T}x^{(i)} + \theta_0^\ast) - \epsilon^{(i)\ast}) = 0$
  2. $\mu_i^\ast \epsilon^{(i)\ast} = 0$

Case 1: $\alpha_i^\ast = 0$

Then $\mu_i^\ast = C - \alpha_i^\ast = C$. From CS 2 we get $\epsilon^{(i)\ast} = 0$, i.e. these points don’t “pay a bribe”. Furthermore, primal feasibility gives $1-y^{(i)}(\theta^{\ast T}x^{(i)} + \theta_0^\ast) - \epsilon^{(i)\ast} \leq 0 \Rightarrow y^{(i)}(\theta^{\ast T}x^{(i)} + \theta_0^\ast) \geq 1$, i.e. these points are classified correctly: they lie either on the supporting hyperplane or strictly beyond it.

Case 2: $\alpha_i^\ast \in (0, C)$

Then $\mu_i^\ast = C - \alpha_i^\ast > 0$, so CS 2 gives $\epsilon^{(i)\ast} = 0$, i.e. these points also don’t “pay a bribe”. Furthermore, since $\alpha_i^\ast > 0$, CS 1 gives $1-y^{(i)}(\theta^{\ast T}x^{(i)} + \theta_0^\ast) = 0$, i.e. these points lie exactly on the supporting hyperplane.

Case 3: $\alpha_i^\ast = C$

Since $\alpha_i^\ast > 0$, CS 1 gives $1-y^{(i)}(\theta^{\ast T}x^{(i)} + \theta_0^\ast) - \epsilon^{(i)\ast} = 0 \Rightarrow \epsilon^{(i)\ast} = 1-y^{(i)}(\theta^{\ast T}x^{(i)} + \theta_0^\ast)$, i.e. these points “pay a bribe” whenever the margin falls short. Since $\epsilon^{(i)\ast} \geq 0$, we have $y^{(i)}(\theta^{\ast T}x^{(i)} + \theta_0^\ast) \leq 1$: these points are either classified correctly but with insufficient margin, or misclassified.


§16.04: SVMs and ERM
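At the optimum of the soft-margin primal, each slack takes the value $\epsilon^{(i)} = \max(0,\, 1 - y^{(i)}(\theta^T x^{(i)} + \theta_0))$, so the constraints can be eliminated entirely. The soft-margin SVM is then regularized empirical risk minimization with the hinge loss (a standard reformulation, stated here in the chapter’s notation):

\[\min_{\theta, \theta_0} \frac{1}{2}||\theta||^2 + C\sum_{i} \max\left(0,\; 1 - y^{(i)}(\theta^T x^{(i)} + \theta_0)\right)\]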



§16.05: SVM Training
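One simple way to train the unconstrained hinge-loss form of the SVM is plain subgradient descent. Below is a minimal illustrative sketch in NumPy (the function name `svm_erm`, the toy dataset, and the hyperparameters are my own choices for illustration, not a production solver):

```python
import numpy as np

def svm_erm(X, y, C=1.0, lr=0.01, epochs=2000):
    """Subgradient descent on (1/2)||theta||^2 + C * sum_i hinge(y_i, x_i)."""
    n, d = X.shape
    theta, theta_0 = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ theta + theta_0)
        active = margins < 1  # points with nonzero hinge loss ("paying a bribe")
        # Subgradient: regularizer term plus hinge term over active points.
        grad_theta = theta - C * (y[active][:, None] * X[active]).sum(axis=0)
        grad_0 = -C * y[active].sum()
        theta -= lr * grad_theta
        theta_0 -= lr * grad_0
    return theta, theta_0

# Toy 1-D data, linearly separable.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
theta, theta_0 = svm_erm(X, y, C=10.0)
print(np.sign(X @ theta + theta_0))  # signs should match y: [-1, -1, 1, 1]
```

For larger problems, dedicated solvers (coordinate ascent on the dual, or stochastic subgradient methods) are the practical choice; the sketch above only illustrates that the ERM view makes the SVM trainable by generic gradient-style methods.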