Support Vector Machines

1. Cost Function

J(\theta) = C\sum_{i=1}^{m}\left[\, y^{(i)}\,\mathrm{cost}_1(\theta^T x^{(i)}) + (1-y^{(i)})\,\mathrm{cost}_0(\theta^T x^{(i)}) \,\right] + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2
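A minimal NumPy sketch of this cost, assuming the usual hinge-style definitions \(\mathrm{cost}_1(z)=\max(0,\,1-z)\) and \(\mathrm{cost}_0(z)=\max(0,\,1+z)\), and that \(\theta_0\) (the intercept) is left unregularized as the sum over \(j = 1,\dots,n\) suggests:

```python
import numpy as np

def cost1(z):
    # cost for y = 1: zero once theta^T x >= 1 (hinge shape)
    return np.maximum(0, 1 - z)

def cost0(z):
    # cost for y = 0: zero once theta^T x <= -1
    return np.maximum(0, 1 + z)

def svm_cost(theta, X, y, C):
    # J(theta) = C * sum_i [ y_i * cost1(theta^T x_i) + (1 - y_i) * cost0(theta^T x_i) ]
    #            + (1/2) * sum_{j=1..n} theta_j^2   (theta[0] = intercept, unregularized)
    z = X @ theta
    data_term = C * np.sum(y * cost1(z) + (1 - y) * cost0(z))
    reg_term = 0.5 * np.sum(theta[1:] ** 2)
    return data_term + reg_term
```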


2. Hypothesis

f(x) = \begin{cases} 1 & \text{if } \theta^T x \ge 0 \\ 0 & \text{otherwise} \end{cases}
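This thresholded hypothesis is a one-liner; a small sketch:

```python
import numpy as np

def hypothesis(theta, x):
    # predict 1 when theta^T x >= 0, else 0
    return 1 if np.dot(theta, x) >= 0 else 0
```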

3. Margin of SVM

(figures: decision boundaries illustrating the large margin of the SVM)

4. Kernels

Define landmarks l^{(1)}, \dots, l^{(m)} and compute one new feature per landmark with a similarity function; with the Gaussian kernel,

f_i = \mathrm{similarity}(x, l^{(i)}) = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)

Use f and θ when making predictions: predict y = 1 when θᵀf ≥ 0.
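A sketch of the Gaussian similarity feature, assuming the standard definition \(f = \exp(-\|x - l\|^2 / (2\sigma^2))\):

```python
import numpy as np

def gaussian_similarity(x, landmark, sigma2):
    # f = exp(-||x - l||^2 / (2 * sigma^2)): near 1 when x is close
    # to the landmark, near 0 when x is far away
    return np.exp(-np.sum((x - landmark) ** 2) / (2 * sigma2))
```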

5. How to Get Landmarks

One way is to put a landmark at each of the m training examples: l^{(i)} = x^{(i)}.
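With landmarks taken from the training set, every example gets m Gaussian features, giving an m × m feature matrix. A sketch of building it in one vectorized step:

```python
import numpy as np

def feature_matrix(X, sigma2):
    # landmarks l^(j) = x^(j): F[i, j] = similarity(x^(i), l^(j))
    diffs = X[:, None, :] - X[None, :, :]    # (m, m, n) pairwise differences
    sq_dists = np.sum(diffs ** 2, axis=2)    # (m, m) squared distances
    return np.exp(-sq_dists / (2 * sigma2))
```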

6. The Effects of Parameters in SVM

1) For C (plays a role similar to 1/λ)
Large C: small λ, low bias, high variance
Small C: large λ, high bias, low variance
2) For σ²
Large σ²: features f_i vary more smoothly, high bias, low variance
Small σ²: features f_i vary less smoothly, low bias, high variance
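The σ² effect can be seen directly from the similarity value at a fixed distance from a landmark (a small numeric sketch, not a full bias/variance experiment):

```python
import numpy as np

d2 = 1.0  # fixed squared distance ||x - l||^2 from a landmark
sigma2_values = (0.1, 1.0, 10.0)
fs = [np.exp(-d2 / (2 * s2)) for s2 in sigma2_values]
for s2, f in zip(sigma2_values, fs):
    print(f"sigma^2 = {s2:>4}: f = {f:.4f}")
```

Large σ² keeps f close to 1 even away from the landmark (smooth features, higher bias); small σ² drives f toward 0 quickly (peaky features, higher variance).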

7. Choice of Kernel

A valid kernel needs to satisfy Mercer's Theorem.
1) No kernel (linear kernel)
Use when n is large, or when n is small and m is large.
2) Gaussian kernel
Use when n is small and m is intermediate.
Perform feature scaling before using it!
3) Other alternative choices:
Polynomial kernel, string kernel, chi-square kernel, histogram intersection kernel…
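A sketch of the linear vs. Gaussian choice using scikit-learn's `SVC` (an assumption: scikit-learn is available; its RBF kernel is exp(−γ‖x−x′‖²), so γ corresponds to 1/(2σ²)). On data with a circular class boundary, the linear kernel underfits while the RBF kernel does not:

```python
import numpy as np
from sklearn.svm import SVC

# toy 2-D data: class 1 inside the unit circle (not linearly separable);
# features are already on a comparable scale, so no extra scaling here
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (np.sum(X ** 2, axis=1) < 1.0).astype(int)

linear = SVC(kernel="linear", C=1.0).fit(X, y)
rbf = SVC(kernel="rbf", C=1.0, gamma=0.5).fit(X, y)  # gamma = 1/(2*sigma^2)

print("linear kernel accuracy:", linear.score(X, y))
print("rbf kernel accuracy:   ", rbf.score(X, y))
```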
