
Reading the caffe code: the Layer and Net classes

2016-08-22 10:26
These two classes are the cornerstones of the caffe framework; as the names suggest, deep learning in caffe revolves around them, so the best way to understand them is to go through the code.

1. Layer

Layers come in five broad categories, each further subdivided by function, but all of them inherit from the single base class Layer. The five categories are:

Data Layers
Common Layers
Activation / Neuron Layers
Loss Layers
Vision Layers

The main member variables and functions of the base class Layer are listed below; read them together with caffe's English comments.
protected:
/** The protobuf that stores the layer parameters */
LayerParameter layer_param_;
/** The phase: TRAIN or TEST */
Phase phase_;
/** The vector that stores the learnable parameters as a set of blobs. */
vector<shared_ptr<Blob<Dtype> > > blobs_;  // blobs_[0] holds the weights, blobs_[1] the bias
/** Vector indicating whether to compute the diff of each param blob. */
vector<bool> param_propagate_down_;  // whether each parameter blob should be updated from the back-propagated gradients

/** The vector that indicates whether each top blob has a non-zero weight in
*  the objective function. */
vector<Dtype> loss_;  // presumably set only by terminal layers such as the softmax loss: the loss weight associated with each top blob

/** Device context */
DeviceContext *device_context_;

/** @brief Using the CPU device, compute the layer output. */
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
/**
* @brief Using the GPU device, compute the layer output.
*        Fall back to Forward_cpu() if unavailable.
*/
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
// LOG(WARNING) << "Using CPU code as backup.";
Forward_cpu(bottom, top);
}

/**
* @brief Using the CPU device, compute the gradients for any parameters and
*        for the bottom blobs if propagate_down is true.
*/
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) = 0;
/**
* @brief Using the GPU device, compute the gradients for any parameters and
*        for the bottom blobs if propagate_down is true.
*        Fall back to Backward_cpu() if unavailable.
*/
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) {
// LOG(WARNING) << "Using CPU code as backup.";
Backward_cpu(top, propagate_down, bottom);
}

/**
* Called by the parent Layer's SetUp to check that the number of bottom
* and top Blobs provided as input match the expected numbers specified by
* the {ExactNum,Min,Max}{Bottom,Top}Blobs() functions.
*/
virtual void CheckBlobCounts(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
  // Simply checks that the numbers of bottom and top blobs are as expected; the concrete code is omitted here.
}

/**
* Called by SetUp to initialize the weights associated with any top blobs in
* the loss function. Store non-zero loss weights in the diff blob.
*/
inline void SetLossWeights(const vector<Blob<Dtype>*>& top) {
const int num_loss_weights = layer_param_.loss_weight_size();
if (num_loss_weights) {
CHECK_EQ(top.size(), num_loss_weights) << "loss_weight must be "
"unspecified or specified once per top blob.";
for (int top_id = 0; top_id < top.size(); ++top_id) {
const Dtype loss_weight = layer_param_.loss_weight(top_id);
if (loss_weight == Dtype(0)) {continue;}
this->set_loss(top_id, loss_weight);
const int count = top[top_id]->count();
Dtype* loss_multiplier = top[top_id]->mutable_cpu_diff();
caffe_set(count, loss_weight, loss_multiplier);
}
}
}
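Why does SetLossWeights write the loss weight into each top blob's diff? A simplified sketch of the tail of the Forward wrapper (my own paraphrase of the CPU branch, not a verbatim copy of caffe's code) shows how those stored weights are later consumed: the top data is dotted with them to accumulate the layer's contribution to the total loss.

// Simplified sketch of the end of Layer::Forward (CPU path):
Dtype loss = 0;
Forward_cpu(bottom, top);
for (int top_id = 0; top_id < top.size(); ++top_id) {
  if (!this->loss(top_id)) { continue; }               // skip tops whose loss weight is zero
  const int count = top[top_id]->count();
  const Dtype* data = top[top_id]->cpu_data();
  const Dtype* loss_weights = top[top_id]->cpu_diff(); // filled by SetLossWeights
  loss += caffe_cpu_dot(count, data, loss_weights);
}
return loss;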
The concrete implementation of SetUp:
  /**
   * @brief Implements common layer setup functionality.
   *
   * @param bottom the preshaped input blobs
   * @param top
   *     the allocated but unshaped output blobs, to be shaped by Reshape
   *
   * Checks that the number of bottom and top blobs is correct.
   * Calls LayerSetUp to do special layer setup for individual layer types,
   * followed by Reshape to set up sizes of top blobs and internal buffers.
   * Sets up the loss weight multiplier blobs for any non-zero loss weights.
   * This method may not be overridden.
   */
void SetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
  CheckBlobCounts(bottom, top);  // check that the numbers of bottom/top blobs are correct
  LayerSetUp(bottom, top);       // call the setup hook implemented by the subclass
  // A note on LayerSetUp: taking BaseConvolutionLayer::LayerSetUp as an example, it mainly does two things:
  // (1) reads pad size, kernel size, etc. from layer_param_;
  // (2) if blobs_ has not been initialized yet (size() == 0), fills blobs_ with a Filler configured in layer_param_.
  Reshape(bottom, top);   // shape the already-allocated top blobs from the allocated and already-shaped bottom blobs
  SetLossWeights(top);    // set the loss weights (only for top blobs whose weight is non-zero)
}
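To tie the Layer interface together: a concrete layer normally only overrides the hooks that SetUp and the Forward/Backward wrappers invoke. The skeleton below is purely illustrative (the class name and the empty bodies are mine, not taken from caffe) and shows which virtual functions a typical subclass implements:

template <typename Dtype>
class MyDummyLayer : public Layer<Dtype> {  // hypothetical example layer
 public:
  explicit MyDummyLayer(const LayerParameter& param) : Layer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // read settings from layer_param_ and, if needed, fill blobs_ (weights/bias) with a Filler
  }
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);  // shape the top blob from the already-shaped bottom blob
  }
 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // compute top data from bottom data
  }
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    // compute bottom diffs (and parameter diffs) from top diffs
  }
  // Forward_gpu / Backward_gpu are optional: the base class falls back to the *_cpu versions.
};

SetUp then drives these hooks in a fixed order: CheckBlobCounts, LayerSetUp, Reshape, SetLossWeights.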
2. Net

The basic member variables; the comments explain them well enough:

/// @brief The network name
string name_;
/// @brief The phase: TRAIN or TEST
Phase phase_;
/// @brief Individual layers in the net
vector<shared_ptr<Layer<Dtype> > > layers_;
vector<string> layer_names_;
map<string, int> layer_names_index_;
vector<bool> layer_need_backward_;
/// @brief the blobs storing intermediate results between the layers.
vector<shared_ptr<Blob<Dtype> > > blobs_;
vector<string> blob_names_;
map<string, int> blob_names_index_;
vector<bool> blob_need_backward_;
/// bottom_vecs stores the vectors containing the input for each layer.
/// They don't actually host the blobs (blobs_ does), so we simply store
/// pointers.
vector<vector<Blob<Dtype>*> > bottom_vecs_;
vector<vector<int> > bottom_id_vecs_;
vector<vector<bool> > bottom_need_backward_;
/// top_vecs stores the vectors containing the output for each layer
vector<vector<Blob<Dtype>*> > top_vecs_;
vector<vector<int> > top_id_vecs_;

Of the remaining member functions, the most important are the forward and backward passes; these need no detailed write-up (a minimal sketch of the forward loop follows below), so the one function worth understanding in depth is Init(). The walkthrough of Init() that follows is adapted from http://blog.csdn.net/u014114990/article/details/47415051 (the code itself is not reproduced here).
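Before walking through Init(), here is a minimal sketch of the forward loop (my own simplification based on the members above, not the literal body of Net::ForwardFromTo), just to show how layers_, bottom_vecs_ and top_vecs_ fit together:

Dtype loss = 0;
for (int i = 0; i < layers_.size(); ++i) {
  // Each layer reads its inputs from bottom_vecs_[i] and writes its outputs into top_vecs_[i];
  // the return value is the layer's weighted contribution to the loss (non-zero only for loss layers).
  loss += layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
}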
Init(const NetParameter& in_param)
Purpose: initialize the network. Input: NetParameter& in_param. Output: none. Steps:

<1> Call InsertSplits() to read the new network from in_param into param.

<2> Define name_, blob_name_to_idx, available_blobs, num_layers.

<3> param.input_size() returns the number of input blobs; param.input(i) is the name of the i-th input blob; param.layers_size() returns the number of layers in the network.

<4> For each input blob: allocate a block of memory with the same shape as the blob. E.g. input_dim=[12 55 66 39 20 24 48 64] means the first blob has the four dimensions 12 55 66 39 and the second 20 24 48 64. Then:
let blob_pointer point to this block and push blob_pointer onto blobs_ (vector<shared_ptr<Blob<Dtype>>> blobs_);
push blob_name onto blob_names_ (vector<string> blob_names_);
push param.force_backward() onto blob_need_backward_ (vector<bool> blob_need_backward_);
push i onto net_input_blob_indices_ (a vector of indices);
push blob_pointer.get() onto net_input_blobs_ (vector<Blob<Dtype>*> net_input_blobs_) -- note the difference from blobs_ (vector<shared_ptr<Blob<Dtype>>>): calling .get() on a shared_ptr yields a raw Blob*;
initialize map<string, int> blob_name_to_idx and set<string> available_blobs with the name of each input blob;
accumulate the memory required: memory_used += blob_pointer->count().

<5> bottom_vecs_ (vector<vector<Blob<Dtype>*> >) stores each layer's input blob pointers, bottom_id_vecs_ (vector<vector<int> >) stores each layer's input (bottom) ids, top_vecs_ (vector<vector<Blob<Dtype>*> >) stores each layer's output (top) blobs, and top_id_vecs_ (vector<vector<int> >) stores the corresponding output ids; all four are sized with the layer count param.layers_size().

<6> For layer i (one big for loop): param.layers(i) returns the parameters of the current layer, layer_param = param.layers(i). Convert the current layer's parameters into a shared_ptr<Layer<Dtype>> and push it onto layers_; push the layer's name onto layer_names_ (vector<string> layer_names_). Decide whether the layer needs backward: need_backward = param.force_backward(). The layer is then wired up, handling its bottom blobs and its top blobs separately.
For the j-th bottom blob: layer_param.bottom_size() is the number of input blobs of the current layer and layer_param.bottom(j) is the name of the j-th one; look up the blob's id via blob_name_to_idx (already filled for the input layer: blob_name_to_idx[blob_name] = i); log the blob's name; store the pointer to the j-th input blob, bottom_vecs_[i].push_back(blobs_[blob_id].get()), and its id, bottom_id_vecs_[i].push_back(blob_id); update need_backward; remove the blob's name from available_blobs.
For the j-th top blob: layer_param.top_size() is the number of output blobs of the current layer and layer_param.top(j) is the name of the j-th one; check whether the computation is done in place; log the blob's name; allocate a new blob and let blob_pointer point to it; push the pointer onto blobs_; store blob_name, force_backward and the index in the corresponding containers; insert the blob's name into available_blobs; push the blob's pointer onto top_vecs_[i] and its id onto top_id_vecs_[i]; log the information of the layer's top blobs; accumulate the memory required; finally decide whether layer i needs backward.

<7> Every blob whose name is still in available_blobs is an output blob of the network; store these in net_output_blobs_.

<8> Build the map from each blob's name to its index: blob_names_index_.

<9> Build the map from each layer's name to its index: layer_names_index_.

<10> Call the GetLearningRateAndWeightDecay function.

3. Solver

This part is short, so rather than starting a new post I'll just cover it here.

•The Solver drives the whole training process. Its responsibilities include:
(1) creating the training network and the test network(s) used for evaluation;
(2) iteratively optimizing by calling forward/backward and updating the parameters;
(3) periodically evaluating the test network(s).

•Each Solver iteration:
(1) calls the network's forward pass to compute the output and the loss;
(2) calls the network's backward pass to compute the gradients;
(3) incorporates the gradients into the parameter update according to the solver method;
(4) updates the solver state according to the learning rate, the history, and the method.