Layer
Layer is the base class of all layers. Five groups of layers derive from it:
data_layer
neuron_layer
loss_layer
common_layer
vision_layer
Each has a corresponding [.hpp .cpp] pair that declares and implements its class interfaces. The five groups are covered one by one below.
data_layer
First, look at the headers included by data_layer.hpp:
```cpp
#include "boost/scoped_ptr.hpp"
#include "hdf5.h"
#include "leveldb/db.h"
#include "lmdb.h"
// the first four headers all relate to data formats
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/data_transformer.hpp"
#include "caffe/filler.hpp"
#include "caffe/internal_thread.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"
```
It is easy to see that data_layer mainly pulls in data-related headers. The official documentation states that data layers are the entry point for data into Caffe and form the lowest layer of a network, and that multiple formats are supported. Among them there are five LayerTypes:
DATA
MEMORY_DATA
HDF5_DATA
HDF5_OUTPUT
IMAGE_DATA
There are actually two more, WINDOW_DATA and DUMMY_DATA, which serve as testing and reserved interfaces; they are not covered here.
DATA
```cpp
template <typename Dtype>
class BaseDataLayer : public Layer<Dtype>

template <typename Dtype>
class BasePrefetchingDataLayer : public BaseDataLayer<Dtype>, public InternalThread

template <typename Dtype>
class DataLayer : public BasePrefetchingDataLayer<Dtype>
```
This type takes LevelDB- or LMDB-format input. Its parameters are `source` and `batch_size`, plus the optional `rand_skip` and `backend`.
MEMORY_DATA
```cpp
template <typename Dtype>
class MemoryDataLayer : public BaseDataLayer<Dtype>
```
This type reads data directly from memory; to use it, call `MemoryDataLayer::Reset`. Its parameters are `batch_size`, `channels`, `height`, and `width`.
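As a rough illustration of that Reset-then-forward pattern, here is a self-contained sketch. The `MiniMemoryData` class is hypothetical, invented for this example; it is not Caffe's actual `MemoryDataLayer`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal sketch of a memory-backed data layer: Reset() points the layer
// at a user-owned buffer, and each forward pass copies out the next batch.
class MiniMemoryData {
 public:
  MiniMemoryData(std::size_t batch_size, std::size_t dim)
      : batch_size_(batch_size), dim_(dim), data_(nullptr), n_(0), pos_(0) {}

  // The caller keeps ownership of `data` (n samples of `dim` floats each).
  void Reset(const float* data, std::size_t n) {
    data_ = data;
    n_ = n;
    pos_ = 0;
  }

  // Copy the next batch into `top`, wrapping around at the end of the data.
  void Forward(std::vector<float>* top) {
    top->resize(batch_size_ * dim_);
    for (std::size_t i = 0; i < batch_size_; ++i) {
      const float* src = data_ + (pos_ % n_) * dim_;
      std::copy(src, src + dim_, top->begin() + i * dim_);
      ++pos_;
    }
  }

 private:
  std::size_t batch_size_, dim_;
  const float* data_;
  std::size_t n_, pos_;
};
```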
HDF5_DATA
```cpp
template <typename Dtype>
class HDF5DataLayer : public Layer<Dtype>
```
This type takes HDF5-format input. Its parameters are `source` and `batch_size`.
HDF5_OUTPUT
```cpp
template <typename Dtype>
class HDF5OutputLayer : public Layer<Dtype>
```
This type writes output in HDF5 format. Its parameter is `file_name`.
IMAGE_DATA
```cpp
template <typename Dtype>
class ImageDataLayer : public BasePrefetchingDataLayer<Dtype>
```
This type takes image-format input. Its parameters are `source` and `batch_size`, plus the optional `rand_skip`, `shuffle`, `new_height`, and `new_width`.
neuron_layer
First, the headers included by neuron_layer.hpp:
```cpp
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"
```
neuron_layer is likewise a layer that operates on data: it implements a large number of activation functions. These are mostly element-wise operations, so the `bottom` and `top` blobs have the same size. Caffe provides both CPU and GPU implementations for many of them, and their common parent class is NeuronLayer:
```cpp
template <typename Dtype>
class NeuronLayer : public Layer<Dtype>
```
There is nothing here that needs deep study at the moment. Worth noting is the typical parameter format, shown below (taking ReLU as an example):
```
layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}
```
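Since `bottom` and `top` name the same blob above, ReLU runs in place; the element-wise operation itself is just max(0, x). A minimal sketch (not Caffe's actual kernel):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// In-place ReLU: top and bottom share storage, exactly like the
// bottom: "conv1" / top: "conv1" configuration above.
void relu_inplace(std::vector<float>& blob) {
  for (float& v : blob) v = std::max(0.0f, v);
}
```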
loss_layer
The loss layers compute the network's error. Headers included by loss_layer.hpp:
```cpp
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"
```
Note that it includes `neuron_layers.hpp`, presumably because the loss computation calls functions defined there. A loss layer is generally placed at the end of the network. Caffe implements a large number of loss functions, and their common parent class is LossLayer.
```cpp
template <typename Dtype>
class LossLayer : public Layer<Dtype>
```
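As a concrete example of what one such loss function computes, here is a plain C++ sketch of softmax cross-entropy for a single sample (illustrative only; Caffe's softmax-loss layer is more general and works on whole blobs):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax cross-entropy for one sample:
//   loss = -log( exp(z[label]) / sum_j exp(z[j]) )
// The max logit is subtracted first for numerical stability.
double softmax_loss(const std::vector<double>& logits, std::size_t label) {
  double m = logits[0];
  for (double z : logits) m = std::max(m, z);
  double sum = 0.0;
  for (double z : logits) sum += std::exp(z - m);
  return -((logits[label] - m) - std::log(sum));
}
```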
common_layer
First, the headers included by common_layer.hpp:
```cpp
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/data_layers.hpp"
#include "caffe/layer.hpp"
#include "caffe/loss_layers.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"
```
It uses the previously discussed `data_layers.hpp`, `loss_layers.hpp`, and `neuron_layers.hpp`, which shows that more complex operations begin at this level. This group mainly handles the connections between vision_layers. Nine types of common_layer are declared, some with GPU implementations:
InnerProductLayer
SplitLayer
FlattenLayer
ConcatLayer
SilenceLayer
EltwiseLayer
SoftmaxLayer
ArgMaxLayer
MVNLayer
The last four are element-wise operations.
InnerProductLayer
Often used as the fully connected layer. Its configuration format is:
```
layers {
  name: "fc8"
  type: INNER_PRODUCT
  blobs_lr: 1 # learning rate multiplier for the filters
  blobs_lr: 2 # learning rate multiplier for the biases
  weight_decay: 1 # weight decay multiplier for the filters
  weight_decay: 0 # weight decay multiplier for the biases
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}
```
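The forward computation this configures is just top = W · bottom + b, where W has num_output rows. A minimal sketch of that product (not Caffe's BLAS-based implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fully connected forward pass: top[i] = sum_j W[i][j] * bottom[j] + b[i].
// W has num_output rows and bottom.size() columns.
std::vector<float> inner_product(const std::vector<std::vector<float>>& W,
                                 const std::vector<float>& b,
                                 const std::vector<float>& bottom) {
  std::vector<float> top(W.size());
  for (std::size_t i = 0; i < W.size(); ++i) {
    top[i] = b[i];
    for (std::size_t j = 0; j < bottom.size(); ++j)
      top[i] += W[i][j] * bottom[j];
  }
  return top;
}
```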
SplitLayer
Used when one input goes to multiple outputs (at the blob level).
FlattenLayer
Converts the n * c * h * w layout into the vector form n * (c * h * w) * 1 * 1.
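Because blobs are stored row-major, this is purely a shape change: element (n, c, h, w) keeps the same linear offset after flattening. A quick check of that index arithmetic:

```cpp
#include <cassert>
#include <cstddef>

// Linear offset of element (n, c, h, w) in a row-major N x C x H x W blob.
std::size_t offset(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                   std::size_t C, std::size_t H, std::size_t W) {
  return ((n * C + c) * H + h) * W + w;
}
```

Flattening a 2 × 3 × 4 × 5 blob to 2 × 60 × 1 × 1 maps (1, 2, 3, 4) to (1, 2·20 + 3·5 + 4, 0, 0) at the same offset, so no data is moved.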
ConcatLayer
Used when multiple inputs go to a single output.
```
layers {
  name: "concat"
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: CONCAT
  concat_param {
    concat_dim: 1
  }
}
```
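With concat_dim: 1 the blobs are joined along the channel axis, so inputs of shape n × c1 × h × w and n × c2 × h × w give an n × (c1 + c2) × h × w output. A simplified sketch for h = w = 1 (illustrative only, not Caffe's ConcatLayer code):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Channel-wise concat for blobs of shape n x c x 1 x 1 stored row-major:
// output sample i is (in1 sample i) followed by (in2 sample i).
std::vector<float> concat_channels(const std::vector<float>& in1, std::size_t c1,
                                   const std::vector<float>& in2, std::size_t c2,
                                   std::size_t n) {
  std::vector<float> out;
  out.reserve(n * (c1 + c2));
  for (std::size_t i = 0; i < n; ++i) {
    out.insert(out.end(), in1.begin() + i * c1, in1.begin() + (i + 1) * c1);
    out.insert(out.end(), in2.begin() + i * c2, in2.begin() + (i + 1) * c2);
  }
  return out;
}
```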
SilenceLayer
Used when one input goes to multiple outputs (at the layer level).
(Element-wise operations)
EltwiseLayer, SoftmaxLayer, ArgMaxLayer, MVNLayer
vision_layer
Its header file includes all the preceding ones, which means it contains the most complex operations.
```cpp
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/common_layers.hpp"
#include "caffe/data_layers.hpp"
#include "caffe/layer.hpp"
#include "caffe/loss_layers.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"
```
It mainly implements the Convolution and Pooling operations, through the following classes:
```cpp
template <typename Dtype>
class ConvolutionLayer : public Layer<Dtype>

template <typename Dtype>
class Im2colLayer : public Layer<Dtype>

template <typename Dtype>
class LRNLayer : public Layer<Dtype>

template <typename Dtype>
class PoolingLayer : public Layer<Dtype>
```
ConvolutionLayer
The most commonly used convolution operation; its configuration format is as follows:
```
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1 # learning rate multiplier for the filters
  blobs_lr: 2 # learning rate multiplier for the biases
  weight_decay: 1 # weight decay multiplier for the filters
  weight_decay: 0 # weight decay multiplier for the biases
  convolution_param {
    num_output: 96 # learn 96 filters
    kernel_size: 11 # each filter is 11x11
    stride: 4 # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01 # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}
```
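With no padding set, the spatial output size of such a layer follows out = (in − kernel) / stride + 1; for an input of width 227, the settings above give (227 − 11) / 4 + 1 = 55. A one-line helper to check the arithmetic:

```cpp
#include <cassert>
#include <cstddef>

// Spatial output size of a convolution without padding:
// out = (in - kernel) / stride + 1 (integer division).
std::size_t conv_out_size(std::size_t in, std::size_t kernel, std::size_t stride) {
  return (in - kernel) / stride + 1;
}
```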
Im2colLayer
Similar to im2col in MATLAB: an image-to-column transformation that rearranges the data so that convolution is easier to compute.
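A minimal single-channel, stride-1 sketch of the idea (not Caffe's actual im2col_cpu): each k × k patch becomes one column, after which convolution is a plain matrix product.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-channel im2col, stride 1, no padding.
// Input: H x W image (row-major). Output: (k*k) x (out_h*out_w) matrix,
// also row-major, where each column holds one k x k patch.
std::vector<float> im2col(const std::vector<float>& img, std::size_t H,
                          std::size_t W, std::size_t k) {
  std::size_t out_h = H - k + 1, out_w = W - k + 1;
  std::vector<float> col(k * k * out_h * out_w);
  for (std::size_t ki = 0; ki < k; ++ki)
    for (std::size_t kj = 0; kj < k; ++kj)
      for (std::size_t i = 0; i < out_h; ++i)
        for (std::size_t j = 0; j < out_w; ++j)
          col[(ki * k + kj) * (out_h * out_w) + i * out_w + j] =
              img[(i + ki) * W + (j + kj)];
  return col;
}
```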
LRNLayer
Short for local response normalization layer, described in detail in the paper by Krizhevsky, Sutskever and Hinton, ImageNet Classification with Deep Convolutional Neural Networks.
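The normalization that paper defines, applied across n adjacent channels at each spatial position (x, y), is:

$$
b^i_{x,y} = a^i_{x,y} \Big/ \Big( k + \alpha \sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)} \big( a^j_{x,y} \big)^2 \Big)^{\beta}
$$

where a is the activity before normalization, N is the total number of channels, and k, n, α, β are hyperparameters (exposed as local_size, alpha, and beta in Caffe's lrn_param).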
PoolingLayer
The Pooling operation. Format:
```
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2 # step two pixels (in the bottom blob) between pooling regions
  }
}
```
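A plain C++ sketch of the 3 × 3, stride-2 max pooling configured above, for a single channel without padding (illustrative only, not Caffe's PoolingLayer code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Single-channel max pooling, no padding.
// Output size per dimension: (in - kernel) / stride + 1.
std::vector<float> max_pool(const std::vector<float>& img, std::size_t H,
                            std::size_t W, std::size_t k, std::size_t stride) {
  std::size_t out_h = (H - k) / stride + 1, out_w = (W - k) / stride + 1;
  std::vector<float> out(out_h * out_w);
  for (std::size_t i = 0; i < out_h; ++i)
    for (std::size_t j = 0; j < out_w; ++j) {
      float m = img[i * stride * W + j * stride];
      for (std::size_t di = 0; di < k; ++di)
        for (std::size_t dj = 0; dj < k; ++dj)
          m = std::max(m, img[(i * stride + di) * W + (j * stride + dj)]);
      out[i * out_w + j] = m;
    }
  return out;
}
```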