
Reading the caffe code: the Layer and Net classes

2016-08-22 10:26
These two classes are the cornerstones of the caffe framework; as the names suggest, deep learning in caffe revolves around them, so the best way to understand them is to go through the code.

1. Layer

Layers come in five broad categories, each further subdivided by function, but all of them inherit from the single base class Layer. The five categories are:

Data Layers
Common Layers
Activation / Neuron Layers
Loss Layers
Vision Layers

The main member variables and functions of the base class Layer are listed below; read them together with caffe's English comments.
protected:
/** The protobuf that stores the layer parameters */
LayerParameter layer_param_;
/** The phase: TRAIN or TEST */
Phase phase_;
/** The vector that stores the learnable parameters as a set of blobs. */
vector<shared_ptr<Blob<Dtype> > > blobs_;  // blobs_[0] holds the weights, blobs_[1] the bias
/** Vector indicating whether to compute the diff of each param blob. */
vector<bool> param_propagate_down_;  // whether each parameter blob should be updated from the back-propagated gradients

/** The vector that indicates whether each top blob has a non-zero weight in
*  the objective function. */
vector<Dtype> loss_;  // presumably set only by terminal layers such as the softmax loss: the loss weight associated with each top blob

/** Device context */
DeviceContext *device_context_;

/** @brief Using the CPU device, compute the layer output. */
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
/**
* @brief Using the GPU device, compute the layer output.
*        Fall back to Forward_cpu() if unavailable.
*/
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
// LOG(WARNING) << "Using CPU code as backup.";
Forward_cpu(bottom, top);
}

/**
* @brief Using the CPU device, compute the gradients for any parameters and
*        for the bottom blobs if propagate_down is true.
*/
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) = 0;
/**
* @brief Using the GPU device, compute the gradients for any parameters and
*        for the bottom blobs if propagate_down is true.
*        Fall back to Backward_cpu() if unavailable.
*/
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) {
// LOG(WARNING) << "Using CPU code as backup.";
Backward_cpu(top, propagate_down, bottom);
}

/**
* Called by the parent Layer's SetUp to check that the number of bottom
* and top Blobs provided as input match the expected numbers specified by
* the {ExactNum,Min,Max}{Bottom,Top}Blobs() functions.
*/
virtual void CheckBlobCounts(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
  // Simply checks that the numbers of bottom and top blobs are as expected; the concrete code is omitted here.
}

/**
* Called by SetUp to initialize the weights associated with any top blobs in
* the loss function. Store non-zero loss weights in the diff blob.
*/
inline void SetLossWeights(const vector<Blob<Dtype>*>& top) {
const int num_loss_weights = layer_param_.loss_weight_size();
if (num_loss_weights) {
CHECK_EQ(top.size(), num_loss_weights) << "loss_weight must be "
"unspecified or specified once per top blob.";
for (int top_id = 0; top_id < top.size(); ++top_id) {
const Dtype loss_weight = layer_param_.loss_weight(top_id);
if (loss_weight == Dtype(0)) {continue;}
this->set_loss(top_id, loss_weight);
const int count = top[top_id]->count();
Dtype* loss_multiplier = top[top_id]->mutable_cpu_diff();
caffe_set(count, loss_weight, loss_multiplier);
}
}
}
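Why does SetLossWeights write the loss weight into each top blob's diff? A simplified sketch of the tail of the Forward wrapper (my own paraphrase of the CPU branch, not a verbatim copy of caffe's code) shows how those stored weights are later consumed: the top data is dotted with them to accumulate the layer's contribution to the total loss.

// Simplified sketch of the end of Layer::Forward (CPU path):
Dtype loss = 0;
Forward_cpu(bottom, top);
for (int top_id = 0; top_id < top.size(); ++top_id) {
  if (!this->loss(top_id)) { continue; }               // skip tops whose loss weight is zero
  const int count = top[top_id]->count();
  const Dtype* data = top[top_id]->cpu_data();
  const Dtype* loss_weights = top[top_id]->cpu_diff(); // filled by SetLossWeights
  loss += caffe_cpu_dot(count, data, loss_weights);
}
return loss;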
The concrete implementation of SetUp:
  /**
   * @brief Implements common layer setup functionality.
   *
   * @param bottom the preshaped input blobs
   * @param top
   *     the allocated but unshaped output blobs, to be shaped by Reshape
   *
   * Checks that the number of bottom and top blobs is correct.
   * Calls LayerSetUp to do special layer setup for individual layer types,
   * followed by Reshape to set up sizes of top blobs and internal buffers.
   * Sets up the loss weight multiplier blobs for any non-zero loss weights.
   * This method may not be overridden.
   */
void SetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
  CheckBlobCounts(bottom, top);  // check that the numbers of bottom/top blobs are correct
  LayerSetUp(bottom, top);       // call the setup hook implemented by the subclass
  // A note on LayerSetUp: taking BaseConvolutionLayer::LayerSetUp as an example, it mainly does two things:
  // (1) reads pad size, kernel size, etc. from layer_param_;
  // (2) if blobs_ has not been initialized yet (size() == 0), fills blobs_ with a Filler configured in layer_param_.
  Reshape(bottom, top);   // shape the already-allocated top blobs from the allocated and already-shaped bottom blobs
  SetLossWeights(top);    // set the loss weights (only for top blobs whose weight is non-zero)
}
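To tie the Layer interface together: a concrete layer normally only overrides the hooks that SetUp and the Forward/Backward wrappers invoke. The skeleton below is purely illustrative (the class name and the empty bodies are mine, not taken from caffe) and shows which virtual functions a typical subclass implements:

template <typename Dtype>
class MyDummyLayer : public Layer<Dtype> {  // hypothetical example layer
 public:
  explicit MyDummyLayer(const LayerParameter& param) : Layer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // read settings from layer_param_ and, if needed, fill blobs_ (weights/bias) with a Filler
  }
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);  // shape the top blob from the already-shaped bottom blob
  }
 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // compute top data from bottom data
  }
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    // compute bottom diffs (and parameter diffs) from top diffs
  }
  // Forward_gpu / Backward_gpu are optional: the base class falls back to the *_cpu versions.
};

SetUp then drives these hooks in a fixed order: CheckBlobCounts, LayerSetUp, Reshape, SetLossWeights.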
2. Net

The basic member variables; the comments explain them well enough:

/// @brief The network name
string name_;
/// @brief The phase: TRAIN or TEST
Phase phase_;
/// @brief Individual layers in the net
vector<shared_ptr<Layer<Dtype> > > layers_;
vector<string> layer_names_;
map<string, int> layer_names_index_;
vector<bool> layer_need_backward_;
/// @brief the blobs storing intermediate results between the layers.
vector<shared_ptr<Blob<Dtype> > > blobs_;
vector<string> blob_names_;
map<string, int> blob_names_index_;
vector<bool> blob_need_backward_;
/// bottom_vecs stores the vectors containing the input for each layer.
/// They don't actually host the blobs (blobs_ does), so we simply store
/// pointers.
vector<vector<Blob<Dtype>*> > bottom_vecs_;
vector<vector<int> > bottom_id_vecs_;
vector<vector<bool> > bottom_need_backward_;
/// top_vecs stores the vectors containing the output for each layer
vector<vector<Blob<Dtype>*> > top_vecs_;
vector<vector<int> > top_id_vecs_;

Of the remaining member functions, the most important are the forward and backward passes; these need no detailed write-up (a minimal sketch of the forward loop follows below), so the one function worth understanding in depth is Init(). The walkthrough of Init() that follows is adapted from http://blog.csdn.net/u014114990/article/details/47415051 (the code itself is not reproduced here).
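Before walking through Init(), here is a minimal sketch of the forward loop (my own simplification based on the members above, not the literal body of Net::ForwardFromTo), just to show how layers_, bottom_vecs_ and top_vecs_ fit together:

Dtype loss = 0;
for (int i = 0; i < layers_.size(); ++i) {
  // Each layer reads its inputs from bottom_vecs_[i] and writes its outputs into top_vecs_[i];
  // the return value is the layer's weighted contribution to the loss (non-zero only for loss layers).
  loss += layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
}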
Init(const NetParameter& in_param)
Purpose: initialize the network. Input: NetParameter& in_param. Output: none. Steps:

<1> Call InsertSplits() to read the new network from in_param into param.

<2> Define name_, blob_name_to_idx, available_blobs, num_layers.

<3> param.input_size() returns the number of input blobs; param.input(i) is the name of the i-th input blob; param.layers_size() returns the number of layers in the network.

<4> For each input blob: allocate a block of memory with the same shape as the blob. E.g. input_dim=[12 55 66 39 20 24 48 64] means the first blob has the four dimensions 12 55 66 39 and the second 20 24 48 64. Then:
let blob_pointer point to this block and push blob_pointer onto blobs_ (vector<shared_ptr<Blob<Dtype>>> blobs_);
push blob_name onto blob_names_ (vector<string> blob_names_);
push param.force_backward() onto blob_need_backward_ (vector<bool> blob_need_backward_);
push i onto net_input_blob_indices_ (a vector of indices);
push blob_pointer.get() onto net_input_blobs_ (vector<Blob<Dtype>*> net_input_blobs_) -- note the difference from blobs_ (vector<shared_ptr<Blob<Dtype>>>): calling .get() on a shared_ptr yields a raw Blob*;
initialize map<string, int> blob_name_to_idx and set<string> available_blobs with the name of each input blob;
accumulate the memory required: memory_used += blob_pointer->count().

<5> bottom_vecs_ (vector<vector<Blob<Dtype>*> >) stores each layer's input blob pointers, bottom_id_vecs_ (vector<vector<int> >) stores each layer's input (bottom) ids, top_vecs_ (vector<vector<Blob<Dtype>*> >) stores each layer's output (top) blobs, and top_id_vecs_ (vector<vector<int> >) stores the corresponding output ids; all four are sized with the layer count param.layers_size().

<6> For layer i (one big for loop): param.layers(i) returns the parameters of the current layer, layer_param = param.layers(i). Convert the current layer's parameters into a shared_ptr<Layer<Dtype>> and push it onto layers_; push the layer's name onto layer_names_ (vector<string> layer_names_). Decide whether the layer needs backward: need_backward = param.force_backward(). The layer is then wired up, handling its bottom blobs and its top blobs separately.
For the j-th bottom blob: layer_param.bottom_size() is the number of input blobs of the current layer and layer_param.bottom(j) is the name of the j-th one; look up the blob's id via blob_name_to_idx (already filled for the input layer: blob_name_to_idx[blob_name] = i); log the blob's name; store the pointer to the j-th input blob, bottom_vecs_[i].push_back(blobs_[blob_id].get()), and its id, bottom_id_vecs_[i].push_back(blob_id); update need_backward; remove the blob's name from available_blobs.
For the j-th top blob: layer_param.top_size() is the number of output blobs of the current layer and layer_param.top(j) is the name of the j-th one; check whether the computation is done in place; log the blob's name; allocate a new blob and let blob_pointer point to it; push the pointer onto blobs_; store blob_name, force_backward and the index in the corresponding containers; insert the blob's name into available_blobs; push the blob's pointer onto top_vecs_[i] and its id onto top_id_vecs_[i]; log the information of the layer's top blobs; accumulate the memory required; finally decide whether layer i needs backward.

<7> Every blob whose name is still in available_blobs is an output blob of the network; store these in net_output_blobs_.

<8> Build the map from each blob's name to its index: blob_names_index_.

<9> Build the map from each layer's name to its index: layer_names_index_.

<10> Call the GetLearningRateAndWeightDecay function.

3. Solver

This part is short, so rather than starting a new post I'll just cover it here.

•The Solver drives the whole training process. Its responsibilities include:
(1) creating the training network and the test network(s) used for evaluation;
(2) iteratively optimizing by calling forward/backward and updating the parameters;
(3) periodically evaluating the test network(s).

•Each Solver iteration:
(1) calls the network's forward pass to compute the output and the loss;
(2) calls the network's backward pass to compute the gradients;
(3) incorporates the gradients into the parameter update according to the solver method;
(4) updates the solver state according to the learning rate, the history, and the method.