SSD Source Code Walkthrough: The MultiBoxLoss Function Definitions
2017-11-17 22:48
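How SSD computes the location loss function
SSD's loss function is the sum of two terms: one loss computed on the class confidence and one computed on the regression of the default box locations. N is the number of matched default boxes; x indicates whether a matched box belongs to class p and takes a value in {0, 1}; l is the predicted box; g is the ground truth box; c is the confidence that the boxed object belongs to class p.
Focusing only on the location loss Lloc, here is how SSD's Caffe source implements it:
Caffe source
In caffe-ssd/jobs/VGGNet/VOC0712/SSD_300x300/train.prototxt, searching for loss leads straight to the MultiBoxLoss layer, which takes several bottom layers. Scrolling up in the same file shows that the first three bottoms are data layers produced by Concat layers that combine the outputs of multiple layers. Selecting default boxes from multiple layers in this way is what characterizes SSD, and the paper cites several works as the origin of the idea. The corresponding source file, multibox_loss_layer.cpp, is in src/caffe/layers. In it, LayerSetUp() reads this layer's parameters from the prototxt, and Forward_cpu() processes the layer's data; bottom[0] and bottom[3] hold the loc layer data and the label data, respectively.
Forward_cpu() then calls EncodeLocPrediction() to do the computation. The function is declared in bbox_util.hpp (under include/caffe/util/) as follows: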
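For reference, the objective described above can be written out as below. This is reconstructed from the SSD paper rather than from this post; the weight α between the two terms is not mentioned in the text and is an addition taken from the paper.

```latex
L(x, c, l, g) = \frac{1}{N}\left( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \right)
```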
// Encode the localization prediction and ground truth for each matched prior.
//    all_loc_preds: stores the location prediction, where each item contains
//      location prediction for an image.
//    all_gt_bboxes: stores ground truth bboxes for the batch.
//    all_match_indices: stores mapping between predictions and ground truth.
//    prior_bboxes: stores all the prior bboxes in the format of NormalizedBBox.
//    prior_variances: stores all the variances needed by prior bboxes.
//    multibox_loss_param: stores the parameters for MultiBoxLossLayer.
//    loc_pred_data: stores the location prediction results.
//    loc_gt_data: stores the encoded location ground truth.
template <typename Dtype>
void EncodeLocPrediction(const vector<LabelBBox>& all_loc_preds,
      const map<int, vector<NormalizedBBox> >& all_gt_bboxes,
      const vector<map<int, vector<int> > >& all_match_indices,
      const vector<NormalizedBBox>& prior_bboxes,
      const vector<vector<float> >& prior_variances,
      const MultiBoxLossParameter& multibox_loss_param,
      Dtype* loc_pred_data, Dtype* loc_gt_data);
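To make "encoded location ground truth" concrete, here is a minimal sketch of a center-size encoding of one ground truth box against one prior, with each offset divided by the matching prior variance. It is an illustration under assumptions, not the Caffe implementation: the Box struct and the EncodeCenterSize name are hypothetical stand-ins for NormalizedBBox and the library's own encoding routine, and only this one encoding variant is shown.

```cpp
#include <array>
#include <cmath>

// Hypothetical corner-format box; stands in for Caffe's NormalizedBBox.
struct Box {
  float xmin, ymin, xmax, ymax;
};

// Encode a ground truth box relative to a prior in center-size form.
// Each offset is divided by the corresponding prior variance.
// Returns {t_cx, t_cy, t_w, t_h}.
std::array<float, 4> EncodeCenterSize(const Box& prior, const Box& gt,
                                      const std::array<float, 4>& variance) {
  const float pw = prior.xmax - prior.xmin;
  const float ph = prior.ymax - prior.ymin;
  const float pcx = 0.5f * (prior.xmin + prior.xmax);
  const float pcy = 0.5f * (prior.ymin + prior.ymax);

  const float gw = gt.xmax - gt.xmin;
  const float gh = gt.ymax - gt.ymin;
  const float gcx = 0.5f * (gt.xmin + gt.xmax);
  const float gcy = 0.5f * (gt.ymin + gt.ymax);

  return {(gcx - pcx) / pw / variance[0],
          (gcy - pcy) / ph / variance[1],
          std::log(gw / pw) / variance[2],
          std::log(gh / ph) / variance[3]};
}
```

A ground truth box identical to its prior encodes to four zeros, which is the target the location prediction is regressed toward in that case.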
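So the SSD implementation takes the whole set of default boxes and ground truth boxes that satisfy the matching strategy and computes over them. From here we can trace where the arguments of this call come from. In particular, FindMatches() is what finds the set that meets the criteria. It is also declared in bbox_util.hpp: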
// Find matches between prediction bboxes and ground truth bboxes.
//    all_loc_preds: stores the location prediction, where each item contains
//      location prediction for an image.
//    all_gt_bboxes: stores ground truth bboxes for the batch.
//    prior_bboxes: stores all the prior bboxes in the format of NormalizedBBox.
//    prior_variances: stores all the variances needed by prior bboxes.
//    multibox_loss_param: stores the parameters for MultiBoxLossLayer.
//    all_match_overlaps: stores jaccard overlaps between predictions and gt.
//    all_match_indices: stores mapping between predictions and ground truth.
void FindMatches(const vector<LabelBBox>& all_loc_preds,
      const map<int, vector<NormalizedBBox> >& all_gt_bboxes,
      const vector<NormalizedBBox>& prior_bboxes,
      const vector<vector<float> >& prior_variances,
      const MultiBoxLossParameter& multibox_loss_param,
      vector<map<int, vector<float> > >* all_match_overlaps,
      vector<map<int, vector<int> > >* all_match_indices);
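The matching strategy in the SSD paper has two steps: each ground truth box is first matched to the default box with the best Jaccard overlap, and then any remaining default box is matched to a ground truth whose overlap exceeds a threshold. The sketch below illustrates that idea on a precomputed overlap matrix. MatchPriors and its parameters are hypothetical names, and this is a simplification, not the body of FindMatches().

```cpp
#include <vector>

// overlaps[i][j]: Jaccard overlap between prior i and ground truth j.
// Returns, for each prior, the index of its matched ground truth, or -1.
std::vector<int> MatchPriors(const std::vector<std::vector<float>>& overlaps,
                             int num_gt, float threshold) {
  const int num_priors = static_cast<int>(overlaps.size());
  std::vector<int> match(num_priors, -1);
  std::vector<bool> gt_used(num_gt, false);

  // Step 1: repeatedly take the best remaining (prior, ground truth) pair,
  // so that every ground truth with a positive overlap gets one prior.
  for (int round = 0; round < num_gt; ++round) {
    int best_i = -1, best_j = -1;
    float best = 0.0f;
    for (int i = 0; i < num_priors; ++i) {
      if (match[i] != -1) continue;
      for (int j = 0; j < num_gt; ++j) {
        if (gt_used[j]) continue;
        if (overlaps[i][j] > best) {
          best = overlaps[i][j];
          best_i = i;
          best_j = j;
        }
      }
    }
    if (best_i == -1) break;  // no positive overlap left
    match[best_i] = best_j;
    gt_used[best_j] = true;
  }

  // Step 2: a still-unmatched prior takes its best ground truth,
  // but only if that overlap reaches the threshold.
  for (int i = 0; i < num_priors; ++i) {
    if (match[i] != -1) continue;
    int best_j = -1;
    float best = threshold;
    for (int j = 0; j < num_gt; ++j) {
      if (overlaps[i][j] >= best) {
        best = overlaps[i][j];
        best_j = j;
      }
    }
    match[i] = best_j;
  }
  return match;
}
```

Priors left at -1 are the candidates for negative samples; the matched ones are the positives that enter the location loss.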
jaccard overlap/ComputeLocLoss
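FindMatches is where the Jaccard overlap is handled, so it is worth seeing how the source computes the overlap: the function calls MatchBBox() (line 584 of bbox_util.cpp), which in turn calls JaccardOverlap(), which calls IntersectBBox() to compute the overlapping region. SSD also uses this function during data augmentation, although additional checks follow there.
Later, multibox_loss_layer.cpp calls MineHardExamples() to select positive and negative samples at a ratio of 1:3; it uses jaccardOverlapLabel. The confidence loss is computed here as well, by ComputeConfLossGPU() (line 900 of bbox_util.cpp), and so are the localization losses, by ComputeLocLoss() (line 919 of bbox_util.cpp), which is declared in the header as: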
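The 1:3 selection can be pictured as ranking the unmatched priors by confidence loss and keeping only the hardest ones, capped at three times the number of positives. The sketch below shows just that ranking step. SelectHardNegatives is a hypothetical helper, and any further filtering the real MineHardExamples() applies is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// conf_loss[i]: confidence loss of prior i.
// match[i]: matched ground truth index of prior i, or -1 if unmatched.
// Returns the indices of the negatives to keep, hardest first.
std::vector<int> SelectHardNegatives(const std::vector<float>& conf_loss,
                                     const std::vector<int>& match,
                                     std::size_t neg_pos_ratio) {
  std::vector<int> negatives;
  std::size_t num_pos = 0;
  for (std::size_t i = 0; i < match.size(); ++i) {
    if (match[i] >= 0) {
      ++num_pos;
    } else {
      negatives.push_back(static_cast<int>(i));
    }
  }

  // Rank negatives by descending confidence loss.
  std::sort(negatives.begin(), negatives.end(),
            [&conf_loss](int a, int b) { return conf_loss[a] > conf_loss[b]; });

  // Keep at most neg_pos_ratio negatives per positive.
  const std::size_t keep = std::min(negatives.size(), num_pos * neg_pos_ratio);
  negatives.resize(keep);
  return negatives;
}
```

With one positive and a ratio of 3, only the three unmatched priors with the largest confidence loss contribute to the confidence term.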
// Compute the localization loss per matched prior.
//    loc_pred: stores the location prediction results.
//    loc_gt: stores the encoded location ground truth.
//    all_match_indices: stores mapping between predictions and ground truth.
//    num: number of images in the batch.
//    num_priors: total number of priors.
//    loc_loss_type: type of localization loss, Smooth_L1 or L2.
//    all_loc_loss: stores the localization loss for all priors in a batch.
template <typename Dtype>
void ComputeLocLoss(const Blob<Dtype>& loc_pred, const Blob<Dtype>& loc_gt,
      const vector<map<int, vector<int> > >& all_match_indices,
      const int num, const int num_priors, const LocLossType loc_loss_type,
      vector<vector<float> >* all_loc_loss);
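For the Smooth_L1 case named in the comment above, the per-prior loss is the sum of the Smooth L1 function over the coordinate differences. A minimal sketch, assuming the standard Smooth L1 definition with the transition at |x| = 1; SmoothL1 and LocLoss are hypothetical helper names, not functions from bbox_util.cpp.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Smooth L1: quadratic near zero, linear for |x| >= 1.
float SmoothL1(float x) {
  const float a = std::fabs(x);
  return a < 1.0f ? 0.5f * x * x : a - 0.5f;
}

// Localization loss of one matched prior: Smooth L1 summed over the
// coordinates of (prediction - encoded ground truth).
// Assumes pred and target have the same length.
float LocLoss(const std::vector<float>& pred,
              const std::vector<float>& target) {
  float loss = 0.0f;
  for (std::size_t k = 0; k < pred.size(); ++k) {
    loss += SmoothL1(pred[k] - target[k]);
  }
  return loss;
}
```

The linear branch keeps a badly mispredicted coordinate from dominating the gradient, while the quadratic branch gives a smooth minimum around zero error.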
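Right after that, multibox_loss_layer.cpp calls EncodeLocPrediction().
It then creates loc_loss_layer to run the forward computation. MultiBoxLossLayer inherits from LossLayer, which in turn inherits from Layer. Layer defines the forward and backward functions, which call the virtual functions Forward_cpu and Forward_gpu; backward works the same way.
conf_loss_layer has a similar structure.
From this we can see that x(ij) in the paper's L(loc) selects only the default boxes and ground truth boxes that satisfy the Jaccard overlap constraint when building the loss function.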
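The inheritance chain and the dispatch from the forward entry point to the device-specific virtual functions can be sketched with a toy hierarchy. Everything here is deliberately simplified and hypothetical: the real Caffe classes are templates that operate on Blob vectors and pick the device from a global mode, whereas this sketch passes a flag and returns a string so the dispatch is visible.

```cpp
#include <string>

// Toy base class: a non-virtual Forward() picks the device-specific override.
class Layer {
 public:
  virtual ~Layer() = default;
  std::string Forward(bool use_gpu) {
    return use_gpu ? Forward_gpu() : Forward_cpu();
  }

 protected:
  virtual std::string Forward_cpu() = 0;
  // Falls back to the CPU path when a layer has no GPU implementation.
  virtual std::string Forward_gpu() { return Forward_cpu(); }
};

// Intermediate class in the chain; adds nothing in this sketch.
class LossLayer : public Layer {};

// Only overrides the CPU path, so the GPU request falls back to it.
class MultiBoxLossLayer : public LossLayer {
 protected:
  std::string Forward_cpu() override {
    return "MultiBoxLossLayer::Forward_cpu";
  }
};
```

Calling Forward() through a Layer reference reaches the MultiBoxLossLayer override either way, which is why the walkthrough above only needs to read Forward_cpu().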
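For reference, the localization loss can be written out as below. It is reconstructed from the SSD paper, not from this post. The symbol d for the default box and the set Pos of matched (positive) default boxes are taken from the paper and do not appear in the text above; the class index is written p to agree with the definitions earlier in this post.

```latex
L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}}
    x_{ij}^{p} \, \mathrm{smooth}_{L1}\!\left( l_i^{m} - \hat{g}_j^{m} \right)

\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^{w}}, \qquad
\hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^{h}}, \qquad
\hat{g}_j^{w} = \log\frac{g_j^{w}}{d_i^{w}}, \qquad
\hat{g}_j^{h} = \log\frac{g_j^{h}}{d_i^{h}}
```

Because x is 0 for every unmatched pair, only default boxes that passed the matching step contribute a term.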