Basic Info
Link: ImageNet Classification with Deep Convolutional Neural Networks - NIPS
Authors:
- Alex Krizhevsky, University of Toronto
- Ilya Sutskever, University of Toronto
- Geoffrey E. Hinton, University of Toronto
Summary
Research Objective
The authors' research objective.
Problem Statement
Problem statement: what problem is being solved?
Method(s)
What method or algorithm is used to solve the problem?
Evaluation
How do the authors evaluate their method? Are there weaknesses, or anything worth borrowing?
Conclusion
Which strong conclusions do the authors draw, and which weak ones?
Notes
1. top-5 error
The fraction of test images for which the correct label is not among the five labels considered most probable by the model.
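The definition above can be sketched in a few lines of plain Python. The function name `top_k_error` and the list-of-lists score layout are assumptions made for illustration, not something taken from the paper.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest-scoring classes.

    scores: one list of per-class scores for each sample (hypothetical layout).
    labels: the true class index of each sample.
    """
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)
```

With `k=1` the same function gives the ordinary top-1 error.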
2. Dataset
Images were first rescaled so that the shorter side had length 256, and the central \(256 \times 256\) patch was then cropped from the result.
The images were not pre-processed in any other way, except for subtracting the mean activity over the training set from each pixel.
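A minimal NumPy sketch of these steps, under some assumptions: the helper names are hypothetical, "mean activity" is read as the per-pixel mean image over the training set, and the actual resampling is left to whatever image library is at hand (only the target shape is computed here).

```python
import numpy as np

def rescaled_shape(height, width, short_side=256):
    """Target (height, width) that makes the shorter side equal short_side, keeping aspect ratio."""
    scale = short_side / min(height, width)
    return max(short_side, round(height * scale)), max(short_side, round(width * scale))

def center_crop(image, size=256):
    """Cut the central size x size patch out of an (H, W, C) array."""
    h, w = image.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return image[top:top + size, left:left + size]

def subtract_mean(images):
    """Subtract the per-pixel mean image from an (N, H, W, C) float array.

    Returns the centered images and the mean, which must be reused at test time.
    """
    mean = images.mean(axis=0)
    return images - mean, mean
```

For example, a \(480 \times 640\) image would be resized to \(256 \times 341\) before the central crop.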
3. Architecture
Eight learned layers: five convolutional and three fully-connected.
3.1 ReLU Nonlinearity
The standard way to model a neuron's output \(f\) as a function of its input \(x\) is with \(f(x) = \tanh(x)\) or \(f(x) = (1 + e^{-x})^{-1}\).
In terms of training time with gradient descent, these saturating nonlinearities are much slower than the non-saturating nonlinearity \(f(x) = \max(0, x)\), referred to as Rectified Linear Units (ReLUs).
- Trains the network faster
- Adds nonlinearity to the network
- Mitigates vanishing gradients
- Makes the network's activations sparse
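Sigmoid: the gradient vanishes during backpropagation. Because \(f(x) \in (0, 1)\), the derivative below is at most \(1/4\) and tends to 0 as \(|x|\) grows.
\[f(x) = \frac{1}{1 + e^{-x}}\]
\[f'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = f(x)\,(1 - f(x))\]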
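The sigmoid derivative can be checked numerically and compared with the ReLU gradient. This is only an illustrative sketch; the function names are assumptions, and the ReLU gradient at exactly 0 is taken to be 0 by convention.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Closed form f(x) * (1 - f(x)) of the sigmoid derivative."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Gradient of max(0, x): 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# The sigmoid gradient peaks at x = 0 and shrinks quickly as |x| grows,
# while the ReLU gradient stays at 1 for any positive input.
for x in (0.0, 2.0, 5.0, 10.0):
    print(x, sigmoid_grad(x), relu_grad(x))
```

Stacking \(n\) sigmoid layers multiplies \(n\) such factors during backpropagation, so their contribution to the gradient is bounded by \(4^{-n}\); a chain of active ReLUs contributes a factor of 1.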
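3.2 Parameter Calculation Formulas
Output feature-map size of a convolution, for input size \(F_{in}\), kernel size \(k\), stride \(s\), and padding \(p\): \(F_o = \lfloor \frac{F_{in} + 2p - k}{s} \rfloor + 1\)
With VALID padding: \(F_o = \lceil \frac{F_{in} - k + 1}{s} \rceil\)
With SAME padding: \(F_o = \lceil \frac{F_{in}}{s} \rceil\)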
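The three size formulas translate directly into code. The function names below are hypothetical; the arithmetic follows the formulas as written.

```python
import math

def conv_output_size(f_in, k, s, p=0):
    """General form: floor((F_in + 2p - k) / s) + 1."""
    return (f_in + 2 * p - k) // s + 1

def valid_output_size(f_in, k, s):
    """VALID padding: ceil((F_in - k + 1) / s)."""
    return math.ceil((f_in - k + 1) / s)

def same_output_size(f_in, s):
    """SAME padding: ceil(F_in / s)."""
    return math.ceil(f_in / s)

# An 11x11 kernel with stride 4 on a 227-wide input gives a 55-wide output.
print(conv_output_size(227, 11, 4), valid_output_size(227, 11, 4))
# A 3x3 pooling window with stride 2 on a 55-wide input gives 27.
print(valid_output_size(55, 3, 2))
```

For integer inputs the general form with \(p = 0\) and the VALID form agree, which is why either can be used for unpadded layers.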
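Number of connections, where \(k\) is the kernel size, \(k_c\) the number of kernel (input) channels, \(F_{oc}\) the number of output channels, and the \(+1\) accounts for the bias of each kernel:
\[F_o^2 \times (k^2 \times k_c + 1) \times F_{oc}\]
\[\text{output feature-map area} \times (\text{kernel area} \times \text{kernel channels} + 1) \times \text{output channels}\]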
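As a worked check of the connection formula, here is a small sketch applied to a layer with an \(11 \times 11\) kernel, 3 input channels, 96 output channels, and a \(55 \times 55\) output map. The function names are assumptions for illustration.

```python
def conv_parameters(k, in_channels, out_channels):
    """Learnable parameters: (k^2 * in_channels + 1 bias) per output channel."""
    return (k * k * in_channels + 1) * out_channels

def conv_connections(f_out, k, in_channels, out_channels):
    """Connections: every output position reuses the full parameter set."""
    return f_out * f_out * conv_parameters(k, in_channels, out_channels)

print(conv_parameters(11, 3, 96))       # 34944
print(conv_connections(55, 11, 3, 96))  # 105705600
```

The gap between the two numbers reflects weight sharing: the same 34,944 parameters are applied at each of the \(55 \times 55\) output positions.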
3.3 Network Architecture
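Although AlexNet is usually depicted with the structure in the figure above, the input image size is actually \(227 \times 227 \times 3\), not \(224 \times 224 \times 3\). With a 224 input, \((224 - 11)/4 + 1 = 54.25\), so reaching the 55-wide output would require a fractional padding of 1.5, which is clearly wrong. After simplification:
Parameter calculation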
Layer | kernel_size | depth | stride | padding | input_size | output_size | parameters | connections |
---|---|---|---|---|---|---|---|---|
conv1 | [11, 11] | 96 | 4 | VALID | [227, 227, 3] | \(\lceil \frac{227 - 11 + 1}{4} \rceil = 55\), [55, 55, 96] | \((11 \times 11 \times 3 + 1) \times 96 = 34944\) | \(55 \times 55 \times 34944 = 105705600\) |
pool1 | [3, 3] | 96 | 2 | VALID | [55, 55, 96] | [27, 27, 96] | -- | |
conv2_1 | [5, 5] | 128 | 1 | SAME | [27, 27, 96] | [27, 27, ] | ||