
SSD Source Code Walkthrough: MultiBoxLoss Function Definitions

2017-11-17 22:48

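How SSD computes the location loss function

SSD's loss function is the sum of two terms that are computed separately: a loss on the class confidence and a regression loss on the default box locations.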



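N is the number of matched default boxes; x indicates whether a matched box belongs to class p and takes values in {0,1}; l is the predicted box; g is the ground truth box; c is the confidence that the boxed object belongs to class p.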
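For reference, the symbols above belong to the overall objective given in the SSD paper. The original figure is missing here, so the block below is a reconstruction in the paper's notation; the weight α that balances the two terms is not defined in this post and is taken from the paper.

```latex
L(x, c, l, g) = \frac{1}{N} \left( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \right)
```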

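Focusing only on the location loss Lloc, let's see how the SSD Caffe source code handles it:

Caffe source code

In caffe-ssd/jobs/VGGNet/VOC0712/SSD_300x300/train.prototxt, searching for "loss" leads straight to the MultiBoxLoss layer, which takes several bottom layers. Searching upward in the same file shows that the first three bottom layers are data layers formed by Concat layers that combine the outputs of multiple layers. Selecting default boxes from multiple layers in this way is the distinguishing feature of SSD, and the paper cites several references for the origin of the idea.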



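Next, the corresponding cpp file, multibox_loss_layer.cpp, is in src/caffe/layers. Its LayerSetUp() function reads this layer's parameters from the prototxt, and Forward_cpu() implements the layer's data processing; bottom[0] and bottom[3] correspond to the loc layer data and the label data, respectively.

It then calls EncodeLocPrediction() to do the computation. The function is declared in bbox_util.hpp (include/caffe/util/) as follows: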

// Encode the localization prediction and ground truth for each matched prior.
//    all_loc_preds: stores the location prediction, where each item contains
//      location prediction for an image.
//    all_gt_bboxes: stores ground truth bboxes for the batch.
//    all_match_indices: stores mapping between predictions and ground truth.
//    prior_bboxes: stores all the prior bboxes in the format of NormalizedBBox.
//    prior_variances: stores all the variances needed by prior bboxes.
//    multibox_loss_param: stores the parameters for MultiBoxLossLayer.
//    loc_pred_data: stores the location prediction results.
//    loc_gt_data: stores the encoded location ground truth.
template <typename Dtype>
void EncodeLocPrediction(const vector<LabelBBox>& all_loc_preds,
      const map<int, vector<NormalizedBBox> >& all_gt_bboxes,
      const vector<map<int, vector<int> > >& all_match_indices,
      const vector<NormalizedBBox>& prior_bboxes,
      const vector<vector<float> >& prior_variances,
      const MultiBoxLossParameter& multibox_loss_param,
      Dtype* loc_pred_data, Dtype* loc_gt_data);
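
To make the word "encode" in this declaration concrete, here is a minimal sketch of a center-size encoding of one ground truth box relative to one prior, in the spirit of the offsets the SSD paper regresses (center x, center y, width, height). The `Box` struct and `EncodeCenterSizeSketch` are hypothetical names for illustration, not Caffe's types or functions, and dividing each offset by a per-coordinate variance is an assumption about how `prior_variances` is used.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for NormalizedBBox: corner form, coordinates in [0, 1].
struct Box {
  float xmin, ymin, xmax, ymax;
};

// Encodes a ground truth box relative to a prior box.
// Returns {d_cx, d_cy, d_w, d_h}, each divided by its variance (assumed usage).
std::array<float, 4> EncodeCenterSizeSketch(const Box& prior, const Box& gt,
                                            const std::array<float, 4>& variance) {
  const float pw = prior.xmax - prior.xmin;
  const float ph = prior.ymax - prior.ymin;
  const float gw = gt.xmax - gt.xmin;
  const float gh = gt.ymax - gt.ymin;
  assert(pw > 0 && ph > 0 && gw > 0 && gh > 0);

  const float pcx = 0.5f * (prior.xmin + prior.xmax);
  const float pcy = 0.5f * (prior.ymin + prior.ymax);
  const float gcx = 0.5f * (gt.xmin + gt.xmax);
  const float gcy = 0.5f * (gt.ymin + gt.ymax);

  std::array<float, 4> out;
  out[0] = (gcx - pcx) / pw / variance[0];    // center x offset, scaled by prior width
  out[1] = (gcy - pcy) / ph / variance[1];    // center y offset, scaled by prior height
  out[2] = std::log(gw / pw) / variance[2];   // log width ratio
  out[3] = std::log(gh / ph) / variance[3];   // log height ratio
  return out;
}
```

A ground truth box identical to its prior encodes to four zeros, so the regression targets measure only how far the prior has to move and stretch.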


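As this shows, SSD's implementation takes in the whole set of default boxes and ground truth boxes that satisfy the "matching strategy" and computes over them. From here one can trace where the arguments of this call come from; in particular, FindMatches() finds the set that satisfies the matching conditions. It is also declared in bbox_util.hpp: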

// Find matches between prediction bboxes and ground truth bboxes.
//    all_loc_preds: stores the location prediction, where each item contains
//      location prediction for an image.
//    all_gt_bboxes: stores ground truth bboxes for the batch.
//    prior_bboxes: stores all the prior bboxes in the format of NormalizedBBox.
//    prior_variances: stores all the variances needed by prior bboxes.
//    multibox_loss_param: stores the parameters for MultiBoxLossLayer.
//    all_match_overlaps: stores jaccard overlaps between predictions and gt.
//    all_match_indices: stores mapping between predictions and ground truth.
void FindMatches(const vector<LabelBBox>& all_loc_preds,
      const map<int, vector<NormalizedBBox> >& all_gt_bboxes,
      const vector<NormalizedBBox>& prior_bboxes,
      const vector<vector<float> >& prior_variances,
      const MultiBoxLossParameter& multibox_loss_param,
      vector<map<int, vector<float> > >* all_match_overlaps,
      vector<map<int, vector<int> > >* all_match_indices);
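
The matching strategy described in the SSD paper has two steps: each ground truth box is first matched to the default box with the best jaccard overlap, and then default boxes are matched to any ground truth whose overlap exceeds a threshold (0.5 in the paper). The sketch below illustrates that idea on a precomputed overlap matrix. `MatchPriorsSketch` is a hypothetical helper, not the Caffe implementation, and its first step is a simplification that processes ground truths in index order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// overlaps[i][j] is the jaccard overlap between prior i and ground truth j.
// Returns, for each prior, the matched ground truth index, or -1 if unmatched.
std::vector<int> MatchPriorsSketch(
    const std::vector<std::vector<float> >& overlaps, float threshold) {
  const std::size_t num_priors = overlaps.size();
  std::vector<int> match(num_priors, -1);
  if (num_priors == 0) return match;
  const std::size_t num_gt = overlaps[0].size();

  // Step 1: each ground truth claims its best-overlapping unmatched prior.
  for (std::size_t j = 0; j < num_gt; ++j) {
    int best_prior = -1;
    float best_overlap = 0.0f;
    for (std::size_t i = 0; i < num_priors; ++i) {
      assert(overlaps[i].size() == num_gt);
      if (match[i] == -1 && overlaps[i][j] > best_overlap) {
        best_overlap = overlaps[i][j];
        best_prior = static_cast<int>(i);
      }
    }
    if (best_prior >= 0) match[best_prior] = static_cast<int>(j);
  }

  // Step 2: each remaining prior takes its best ground truth,
  // but only when the overlap exceeds the threshold.
  for (std::size_t i = 0; i < num_priors; ++i) {
    if (match[i] != -1) continue;
    int best_gt = -1;
    float best_overlap = threshold;
    for (std::size_t j = 0; j < num_gt; ++j) {
      if (overlaps[i][j] > best_overlap) {
        best_overlap = overlaps[i][j];
        best_gt = static_cast<int>(j);
      }
    }
    match[i] = best_gt;
  }
  return match;
}
```

With this scheme one ground truth can own several priors, which is why the localization loss later sums over all matched priors rather than over ground truth boxes.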


jaccard overlap/ComputeLocLoss

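FindMatches is where the jaccard overlap is handled, so it is worth a look at how the source computes the overlap: the function calls MatchBBox() (line 584, bbox_util.cpp), which in turn calls JaccardOverlap(), which calls IntersectBBox() to compute the overlapping region. SSD also uses this function during data augmentation, though there it is followed by further checks.

Later, multibox_loss_layer.cpp calls MineHardExamples() to select positive and negative samples at a ratio of 1:3; it uses jaccardOverlapLabel. The confidence loss is also computed here, by ComputeConfLossGPU() (line 900, bbox_util.cpp), as are the localization losses, by ComputeLocLoss() (line 919, bbox_util.cpp), whose declaration in the header is: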
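
As an illustration of what the overlap computation amounts to, here is a minimal jaccard overlap (intersection over union) for two axis-aligned boxes. `Box`, `AreaSketch`, and `JaccardOverlapSketch` are hypothetical names for this sketch and are not the Caffe functions mentioned above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical corner-form box; not Caffe's NormalizedBBox.
struct Box {
  float xmin, ymin, xmax, ymax;
};

// Area of a box; degenerate boxes count as zero area.
float AreaSketch(const Box& b) {
  const float w = b.xmax - b.xmin;
  const float h = b.ymax - b.ymin;
  return (w <= 0.0f || h <= 0.0f) ? 0.0f : w * h;
}

// Intersection area divided by union area, in [0, 1].
float JaccardOverlapSketch(const Box& a, const Box& b) {
  // Width and height of the intersection rectangle, clamped at zero
  // so that disjoint boxes give an empty intersection.
  const float iw =
      std::max(0.0f, std::min(a.xmax, b.xmax) - std::max(a.xmin, b.xmin));
  const float ih =
      std::max(0.0f, std::min(a.ymax, b.ymax) - std::max(a.ymin, b.ymin));
  const float inter = iw * ih;
  const float uni = AreaSketch(a) + AreaSketch(b) - inter;
  return uni > 0.0f ? inter / uni : 0.0f;
}
```

Identical boxes give an overlap of 1 and disjoint boxes give 0, so a threshold such as 0.5 has the same meaning regardless of box size.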
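
The 1:3 selection can be pictured as follows: count the positives, rank the negatives by confidence loss, and keep at most three negatives per positive. This is a simplified sketch under that assumption. `SelectHardNegativesSketch` is a hypothetical helper, and it ignores the overlap-based filtering that the paragraph above says the real function also performs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// match[i] >= 0 marks prior i as a positive; conf_loss[i] is its confidence loss.
// Returns the indices of the negatives that are kept, highest loss first.
std::vector<int> SelectHardNegativesSketch(const std::vector<int>& match,
                                           const std::vector<float>& conf_loss,
                                           int neg_pos_ratio) {
  assert(match.size() == conf_loss.size());
  assert(neg_pos_ratio >= 0);

  std::size_t num_pos = 0;
  std::vector<std::pair<float, int> > negatives;  // (loss, prior index)
  for (std::size_t i = 0; i < match.size(); ++i) {
    if (match[i] >= 0) {
      ++num_pos;
    } else {
      negatives.push_back(std::make_pair(conf_loss[i], static_cast<int>(i)));
    }
  }

  // Hardest negatives (largest confidence loss) come first.
  std::sort(negatives.begin(), negatives.end(),
            std::greater<std::pair<float, int> >());

  const std::size_t num_keep =
      std::min(negatives.size(),
               num_pos * static_cast<std::size_t>(neg_pos_ratio));
  std::vector<int> keep;
  for (std::size_t k = 0; k < num_keep; ++k) {
    keep.push_back(negatives[k].second);
  }
  return keep;
}
```

Capping the negatives this way keeps the many easy background priors from dominating the confidence loss.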

// Compute the localization loss per matched prior.
//    loc_pred: stores the location prediction results.
//    loc_gt: stores the encoded location ground truth.
//    all_match_indices: stores mapping between predictions and ground truth.
//    num: number of images in the batch.
//    num_priors: total number of priors.
//    loc_loss_type: type of localization loss, Smooth_L1 or L2.
//    all_loc_loss: stores the localization loss for all priors in a batch.
template <typename Dtype>
void ComputeLocLoss(const Blob<Dtype>& loc_pred, const Blob<Dtype>& loc_gt,
      const vector<map<int, vector<int> > >& all_match_indices,
      const int num, const int num_priors, const LocLossType loc_loss_type,
      vector<vector<float> >* all_loc_loss);
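
The comment above names Smooth_L1 as one of the two loss types. As a reminder of what that means for a single matched prior, here is a small sketch that sums the smooth L1 loss over the four encoded coordinates. `SmoothL1Sketch` and `LocLossForPriorSketch` are hypothetical names, not functions from bbox_util.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Smooth L1: quadratic near zero, linear for |x| >= 1.
float SmoothL1Sketch(float x) {
  const float ax = std::fabs(x);
  return ax < 1.0f ? 0.5f * x * x : ax - 0.5f;
}

// Localization loss of one matched prior: the sum over the four
// encoded coordinates of smooth L1 applied to (prediction - target).
float LocLossForPriorSketch(const std::array<float, 4>& pred,
                            const std::array<float, 4>& target) {
  float loss = 0.0f;
  for (int k = 0; k < 4; ++k) {
    loss += SmoothL1Sketch(pred[k] - target[k]);
  }
  return loss;
}
```

The linear tail makes the loss less sensitive to large coordinate errors than a plain L2 loss would be.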


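multibox_loss_layer.cpp then immediately calls EncodeLocPrediction().

It then creates loc_loss_layer to run the forward computation. MultiBoxLossLayer inherits from LossLayer, which in turn inherits from Layer. Layer defines the forward and backward functions, which call the virtual functions Forward_cpu and Forward_gpu; backward works the same way.

conf_loss_layer has a similar structure.

From this it follows that when the paper computes L(loc), the X(ij) term selects only the default boxes and ground truth boxes that meet the jaccard overlap requirement when building the loss function.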
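
The inheritance chain described here follows a common pattern: a public, non-virtual forward entry point in the base class dispatches to a device-specific virtual function that subclasses override. The sketch below shows that pattern in miniature. All class names are hypothetical, and the signatures are greatly simplified compared to Caffe's (a vector of floats stands in for the bottom blobs, and a single float for the loss).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class LayerSketch {
 public:
  virtual ~LayerSketch() {}
  // Non-virtual entry point: picks the device-specific implementation.
  float Forward(const std::vector<float>& bottom, bool use_gpu) {
    return use_gpu ? Forward_gpu(bottom) : Forward_cpu(bottom);
  }

 protected:
  virtual float Forward_cpu(const std::vector<float>& bottom) = 0;
  // In this sketch the GPU path falls back to the CPU path unless overridden.
  virtual float Forward_gpu(const std::vector<float>& bottom) {
    return Forward_cpu(bottom);
  }
};

// Intermediate class, standing in for a generic loss layer.
class LossLayerSketch : public LayerSketch {};

// Concrete loss: here simply the sum of its inputs.
class SumLossSketch : public LossLayerSketch {
 protected:
  float Forward_cpu(const std::vector<float>& bottom) override {
    float sum = 0.0f;
    for (std::size_t i = 0; i < bottom.size(); ++i) sum += bottom[i];
    return sum;
  }
};
```

Because the entry point lives in the base class, a layer that owns inner layers can drive them through the same call without knowing their concrete types.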
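
For completeness, the localization term as written in the SSD paper is reproduced below. The original figure is missing from this post, so this is a reconstruction in the paper's notation. The symbol d for the default box and the index set Pos of matched (positive) boxes come from the paper and are not defined in this post; the class superscript is written as p to agree with the definition of x given earlier.

```latex
L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}}
    x_{ij}^{p} \, \mathrm{smooth}_{L1}\left( l_i^{m} - \hat{g}_j^{m} \right)

\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^{w}}, \qquad
\hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^{h}}, \qquad
\hat{g}_j^{w} = \log\frac{g_j^{w}}{d_i^{w}}, \qquad
\hat{g}_j^{h} = \log\frac{g_j^{h}}{d_i^{h}}
```

Only pairs with x_{ij}^{p} = 1 contribute, which is the formal counterpart of restricting the loss to matched default boxes.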
