
Understanding the Caffe Framework (2): AlexNet in Detail

2017-02-22

Introduction

In 2012, Geoffrey Hinton and his student Alex Krizhevsky answered the skeptics by using AlexNet to shatter the image classification record in the ImageNet competition, cementing deep learning's place in computer vision. This article analyzes that model as a way to study the structure of Caffe.

The AlexNet Model Structure

The model file lives at models/bvlc_reference_caffenet/deploy.prototxt under the Caffe root directory; its contents are listed in Appendix 1. An image of the model structure can be generated with draw_net.py by entering the command:

python python/draw_net.py models/bvlc_reference_caffenet/deploy-gph.prototxt examples/AlexNet-gph/pic/alexnet.png --rankdir=TB --phase=ALL


The resulting image is shown in Appendix 2.

Layer-by-Layer Analysis of the Model

1. The data structures of each layer

Enter the following commands in a terminal to prepare the environment:

gph@gph-pc:~ $ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import caffe
>>> import cv2
>>> import cv2.cv as cv
>>> caffe.set_mode_gpu()
>>> caffe_root = '/home/gph/Desktop/caffe-ssd/'
>>> model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy-gph.prototxt'
>>> model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
>>> img_file = caffe_root + 'examples/images/cat.jpg'
>>>


Load the model:

>>> net = caffe.Net(model_def, model_weights, caffe.TEST)


Display all layers together with the dimensions of their data and diff blobs.

Enter the command:

>>> for layer, blob in net.blobs.iteritems():
...     print layer + ' ' + str(blob.data.shape) + ' ' + str(blob.diff.shape)
...


The output is as follows:

data (10, 3, 227, 227) (10, 3, 227, 227)
conv1 (10, 96, 55, 55) (10, 96, 55, 55)
pool1 (10, 96, 27, 27) (10, 96, 27, 27)
norm1 (10, 96, 27, 27) (10, 96, 27, 27)
conv2 (10, 256, 27, 27) (10, 256, 27, 27)
pool2 (10, 256, 13, 13) (10, 256, 13, 13)
norm2 (10, 256, 13, 13) (10, 256, 13, 13)
conv3 (10, 384, 13, 13) (10, 384, 13, 13)
conv4 (10, 384, 13, 13) (10, 384, 13, 13)
conv5 (10, 256, 13, 13) (10, 256, 13, 13)
pool5 (10, 256, 6, 6) (10, 256, 6, 6)
fc6 (10, 4096) (10, 4096)
fc7 (10, 4096) (10, 4096)
fc8 (10, 1000) (10, 1000)
prob (10, 1000) (10, 1000)


Display the layers that have weights.

Enter the command:

>>> for layer, param in net.params.iteritems():
...     print layer + ' ' + str(param[0].data.shape) + ' ' + str(param[1].data.shape)
...


The output is:

conv1 (96, 3, 11, 11) (96,)
conv2 (256, 48, 5, 5) (256,)
conv3 (384, 256, 3, 3) (384,)
conv4 (384, 192, 3, 3) (384,)
conv5 (256, 192, 3, 3) (256,)
fc6 (4096, 9216) (4096,)
fc7 (4096, 4096) (4096,)
fc8 (1000, 4096) (1000,)


2. Analysis

Two kinds of data flow through Caffe:

One is the data being processed: it enters at the input layer, is processed by each layer in turn, and finally produces the output at the output layer. This data is stored in the data field of each blob in net.blobs; alongside it, each blob's diff field holds the corresponding gradients. These are the two arrays we care about most.
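For example, continuing the interactive session above, a single blob's data and gradients can be inspected directly (a quick sketch; the diff values are all zero here because no backward pass has been run yet):

>>> conv1 = net.blobs['conv1']
>>> conv1.data.shape   # activations of conv1 for the batch of 10 images
(10, 96, 55, 55)
>>> conv1.diff.shape   # gradient buffer, always the same shape as data
(10, 96, 55, 55)
>>> conv1.diff.max()   # stays 0.0 until net.backward() is called
0.0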

The other is the parameters each layer uses in its computation, namely the weights and the bias terms; for a layer named layer, they are stored in net.params[layer][0] (weights) and net.params[layer][1] (biases).
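Continuing the session, the weights and biases of conv1 can be read out like this (a minimal sketch; the shapes match the net.params output above):

>>> net.params['conv1'][0].data.shape   # weights
(96, 3, 11, 11)
>>> net.params['conv1'][1].data.shape   # biases, one per output channel
(96,)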

In the AlexNet model, the layers of a convolutional character can change the spatial size of the data. Convolution layers and pooling layers, for example, both take a kernel_size parameter and may therefore change the data's size. Whether the size actually changes depends on the convolution-related parameters kernel_size, pad, and stride, according to the formula

y = (x + 2*pad - kernel_size) / stride + 1
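As a quick check of this formula against the blob shapes listed above, here is a minimal sketch in plain Python (parameter values taken from the deploy.prototxt in Appendix 1; pad defaults to 0 and stride to 1 when omitted):

>>> def out_size(x, kernel_size, pad=0, stride=1):
...     return (x + 2 * pad - kernel_size) / stride + 1
...
>>> out_size(227, 11, stride=4)   # conv1: 227 -> 55
55
>>> out_size(55, 3, stride=2)     # pool1: 55 -> 27
27
>>> out_size(27, 5, pad=2)        # conv2: pad 2 keeps the size at 27
27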

The dimensions of the weights stored by a convolution layer are determined by that layer's kernel_size, group, and num_output together with the previous layer's num_output; the shape is (this layer's num_output, previous layer's num_output / group, kernel_size_h, kernel_size_w). The reason lies in how the convolution is carried out: with group: 2, the input channels are split into two groups and each filter only sees the channels of its own group.
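To make the rule concrete, here is a small sketch (plain Python; num_output, kernel_size, and group values taken from Appendix 1) that reproduces the weight shapes printed by net.params above:

>>> def weight_shape(num_output, prev_channels, kernel_size, group=1):
...     return (num_output, prev_channels / group, kernel_size, kernel_size)
...
>>> weight_shape(96, 3, 11)            # conv1: the input image has 3 channels
(96, 3, 11, 11)
>>> weight_shape(256, 96, 5, group=2)  # conv2: 96 / 2 = 48 channels per group
(256, 48, 5, 5)
>>> weight_shape(384, 256, 3)          # conv3: group defaults to 1
(384, 256, 3, 3)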

The weights of a fully connected layer are much simpler. Because every input is connected to every output, the parameter shape depends only on the number of neurons in the previous layer and the number of output neurons, giving (num_output, number of neurons in the previous layer).
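For example, pool5 outputs 256 * 6 * 6 = 9216 values per image, which Caffe flattens before the fully connected layer, so fc6's weights have shape (4096, 9216). A quick check in the same session:

>>> np.prod(net.blobs['pool5'].data.shape[1:])   # 256 * 6 * 6
9216
>>> net.params['fc6'][0].data.shape
(4096, 9216)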

Appendix 1: Contents of deploy.prototxt

name: "CaffeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 1000
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}


Appendix 2: The AlexNet model

(Structure diagram of AlexNet, generated by the draw_net.py command above.)
