1. Background
Given input dimension \(d\), input data \(x\), \(y\) can be expressed as \(\vec{x} = (x_1, ... , x_d)\), \(\vec{y} = (y_1, ..., y_d)\), an SVM kernel function \(k(\vec{x}, \vec{y})\) can be expressed as the dot product of the transformation of \(\vec{x}\), \(\vec{y}\) by a transformation function \(\gamma\), that is to say
\[k(\vec{x}, \vec{y}) = \gamma(\vec{x}) \cdot \gamma(\vec{y})\]
2. Problems
Derive the transformation function \(\gamma\) for the following SVM kernels, also compute the VC Dimension for the SVM model based on these kernels.
- \(k(\vec{x}, \vec{y}) = (\vec{x} \cdot \vec{y})^n\)
- \(k(\vec{x}, \vec{y}) = (\vec{x} \cdot \vec{y} + 1)^n\)
- \(k(\vec{x}, \vec{y}) = e^{-\sigma(||\vec{x}||^2 + ||\vec{y}||^2)^n}\)
3. \(k(\vec{x}, \vec{y}) = (\vec{x} \cdot \vec{y})^n\)
根据多项式展开公式
\[k(\vec{x}, \vec{y}) = (\vec{x} \cdot \vec{y})^n = (x_1y_1 + x_2y_2 + \cdots + x_dy_d) ^ n = \sum{\frac{n!}{n_1!n_2!\cdots n_d!}{(x_1y_1)}^{n_1}{(x_2y_2)}^{n_2} \cdots {(x_dy_d)}^{n_d}}\]
所以
\[\gamma(\vec{x}) = [\sqrt{\frac{n!}{n_1!n_2!\cdots n_d!}}{x_1}^{n_1}{x_2}^{n_2} \cdots {x_d}^{n_d}, n_1+n_2+\cdots + n_d = n]\]
容易验证
\[\gamma(\vec{x}) \cdot \gamma(\vec{y}) = k(\vec{x}, \vec{y})\]
因此,映射到的维数是和多项式的项数相关的,项数为\(m=\tbinom{n+d-1}{d}\),即\(\mathcal{H} \in \mathbb{R}^{m}\)
该超平面的VC维为\(\tbinom{n+d-1}{d} + 1\)
4. \(k(\vec{x}, \vec{y}) = (\vec{x} \cdot \vec{y} + 1)^n\)
此处相当于扩展
\[\vec{x'} = [\vec{x}, 1]\]
\[\vec{y'} = [\vec{y}, 1]\]
\[k(\vec{x}, \vec{y}) = (\vec{x}\cdot \vec{y} + 1)^n = (\vec{x'}\cdot \vec{y'})^n\]
其中,
\[\vec{x'} \in \mathbb{R}^{d+1}, \vec{y'} \in \mathbb{R}^{d+1}\]
根据3中的结论,
\[\gamma(\vec{x'}) = [\sqrt{\frac{n!}{n_1!n_2!\cdots n_d!n_{d+1}!}}{x_1}^{n_1}{x_2}^{n_2} \cdots {x_d}^{n_d} {x_{d+1}}^{n_{d+1}}, n_1+n_2+\cdots + n_d + n_{d+1}= n]\]
其中\(x_{d+1} = 1\)
该超平面的VC维为\(\tbinom{n+d}{d+1} + 1\)
5. \(k(\vec{x}, \vec{y}) = e^{-\sigma(||\vec{x}||^2 + ||\vec{y}||^2)^n}\)
\[k(\vec{x}, \vec{y}) = e^{-\sigma(||\vec{x}||^2 + ||\vec{y}||^2)^n} = e^{-\sigma(\sum\limits_{i=1}^{d}{(x_i^2+y_i^2)})^n}=e^{-\sigma((x_1^2+y_1^2)+\cdots(x_d^2+y_d^2))^n}\]
\[h(\vec{x}, \vec{y}) = ((x_1^2+y_1^2)+\cdots + (x_d^2+y_d^2))^n=\sum{\frac{n!}{n_1!n_2!\cdots n_{2d}!}{x_1}^{n_1}{x_2}^{n_2} \cdots {x_d}^{n_d}{y_1}^{n_{d+1}}{y_2}^{n_{d+2}} \cdots {y_d}^{n_{2d}}}\]
\[k(\vec{x}, \vec{y}) = e^{-\sigma h(\vec{x},\vec{y})}\]