Caffe Loss Layers - SigmoidCrossEntropyLoss: Derivation and Python Implementation
2018-02-05 13:21
[Original - Caffe custom sigmoid cross entropy loss layer]. A very clear introduction; the notes below follow it.
1. Deriving the Sigmoid Cross Entropy Loss
The sigmoid cross entropy loss is defined as:
$$L = t\ln(P) + (1-t)\ln(1-P)$$
where
$t$ - the target, or label;
$P$ - the sigmoid score, $P = \frac{1}{1+e^{-x}}$.
Substituting the sigmoid gives:
$$L = t\ln\left(\frac{1}{1+e^{-x}}\right) + (1-t)\ln\left(1-\frac{1}{1+e^{-x}}\right)$$
Expanding step by step:
$$L = t\ln\left(\frac{1}{1+e^{-x}}\right) + (1-t)\ln\left(\frac{e^{-x}}{1+e^{-x}}\right)$$
$$L = t\ln\left(\frac{1}{1+e^{-x}}\right) + \ln\left(\frac{e^{-x}}{1+e^{-x}}\right) - t\ln\left(\frac{e^{-x}}{1+e^{-x}}\right)$$
$$L = t\left[\ln 1 - \ln(1+e^{-x})\right] + \left[\ln(e^{-x}) - \ln(1+e^{-x})\right] - t\left[\ln(e^{-x}) - \ln(1+e^{-x})\right]$$
$$L = -t\ln(1+e^{-x}) + \ln(e^{-x}) - \ln(1+e^{-x}) - t\ln(e^{-x}) + t\ln(1+e^{-x})$$
Collecting terms:
$$L = \ln(e^{-x}) - \ln(1+e^{-x}) - t\ln(e^{-x})$$
$$L = -x\ln(e) - \ln(1+e^{-x}) + tx\ln(e)$$
$$L = -x - \ln(1+e^{-x}) + xt$$
That is:
$$L = xt - x - \ln(1+e^{-x}) \qquad (1)$$
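As a quick sanity check (a minimal NumPy sketch, not part of the original post), the simplified form (1) can be compared numerically against the direct definition $t\ln(P) + (1-t)\ln(1-P)$:

```python
import numpy as np

def loss_direct(x, t):
    # direct definition: L = t*ln(P) + (1-t)*ln(1-P), with P = sigmoid(x)
    p = 1.0 / (1.0 + np.exp(-x))
    return t * np.log(p) + (1 - t) * np.log(1 - p)

def loss_simplified(x, t):
    # simplified form (1): L = x*t - x - ln(1 + e^{-x})
    return x * t - x - np.log(1 + np.exp(-x))

x = np.array([0.5, 2.0, 7.0])
t = np.array([1.0, 0.0, 1.0])
print(np.allclose(loss_direct(x, t), loss_simplified(x, t)))  # True
```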
Behaviour of $e^{-x}$ (left) and $e^{x}$ (right):
$e^{-x}$ decreases as $x$ increases; when $x$ is a large negative value, $e^{-x}$ becomes very large and easily overflows. In other words, the implementation must avoid evaluating the exponential in that regime.
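The overflow is easy to reproduce (an illustrative NumPy snippet, assuming float64):

```python
import numpy as np

x = -1000.0
with np.errstate(over='ignore'):
    print(np.exp(-x))        # inf: e^{1000} overflows float64
# the x < 0 form only evaluates e^{x} = e^{-1000}, which harmlessly underflows to 0
print(np.log1p(np.exp(x)))   # 0.0: finite, no overflow
```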
Therefore, to avoid overflow, the loss $L$ is modified: for $x < 0$, it is rewritten in terms of $e^{x}$:
Original loss: $L = xt - x - \ln(1+e^{-x}) \qquad (1)$
so:
$$L = xt - x + \ln\left(\frac{1}{1+e^{-x}}\right)$$
Multiplying the numerator and denominator of the last term by $e^{x}$:
$$L = xt - x + \ln\left(\frac{1 \cdot e^{x}}{(1+e^{-x}) \cdot e^{x}}\right)$$
$$L = xt - x + \ln\left(\frac{e^{x}}{1+e^{x}}\right)$$
$$L = xt - x + \left[\ln(e^{x}) - \ln(1+e^{x})\right]$$
$$L = xt - x + x\ln(e) - \ln(1+e^{x})$$
which gives:
$$L = xt - \ln(1+e^{x}) \qquad (2)$$
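Forms (1) and (2) are algebraically identical, differing only in which exponential they evaluate; a quick NumPy check for moderate $x$, where both are finite:

```python
import numpy as np

x = np.linspace(-5, 5, 11)
t = 1.0
form1 = x * t - x - np.log(1 + np.exp(-x))   # (1), evaluates e^{-x}
form2 = x * t - np.log(1 + np.exp(x))        # (2), evaluates e^{x}
print(np.allclose(form1, form2))  # True
```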
From (1) and (2), the final loss function is:
$$L = xt - x - \ln(1+e^{-x}), \quad (x > 0)$$
$$L = xt - 0 - \ln(1+e^{x}), \quad (x < 0)$$
Merging the two cases into one:
$$L = xt - \max(x, 0) - \ln(1+e^{-|x|}), \quad \text{for all } x$$
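The combined stable form can be implemented directly (a NumPy sketch); unlike the naive form (1), it stays finite even for extreme scores:

```python
import numpy as np

def stable_loss(x, t):
    # L = x*t - max(x, 0) - ln(1 + e^{-|x|}); never exponentiates a positive number
    return x * t - np.maximum(x, 0) - np.log1p(np.exp(-np.abs(x)))

x = np.array([-1000.0, -3.0, 0.0, 3.0, 1000.0])
t = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
print(np.isfinite(stable_loss(x, t)).all())  # True
```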
2. Gradient of the Sigmoid Cross Entropy Loss
For $x > 0$, $L = xt - x - \ln(1+e^{-x})$, so:
$$\frac{\partial L}{\partial x} = \frac{\partial\left(xt - x - \ln(1+e^{-x})\right)}{\partial x}$$
$$\frac{\partial L}{\partial x} = \frac{\partial (xt)}{\partial x} - \frac{\partial x}{\partial x} - \frac{\partial \ln(1+e^{-x})}{\partial x}$$
$$\frac{\partial L}{\partial x} = t - 1 - \frac{1}{1+e^{-x}} \cdot \frac{\partial (1+e^{-x})}{\partial x}$$
$$\frac{\partial L}{\partial x} = t - 1 - \frac{1}{1+e^{-x}} \cdot \frac{\partial e^{-x}}{\partial x}$$
$$\frac{\partial L}{\partial x} = t - 1 + \frac{e^{-x}}{1+e^{-x}}$$
so:
$$\frac{\partial L}{\partial x} = t - \frac{1}{1+e^{-x}}$$
The second term is the sigmoid function $P = \frac{1}{1+e^{-x}}$, hence:
$$\frac{\partial L}{\partial x} = t - P$$
For $x < 0$, $L = xt - \ln(1+e^{x})$:
$$\frac{\partial L}{\partial x} = \frac{\partial\left(xt - \ln(1+e^{x})\right)}{\partial x}$$
$$\frac{\partial L}{\partial x} = \frac{\partial (xt)}{\partial x} - \frac{\partial \ln(1+e^{x})}{\partial x}$$
$$\frac{\partial L}{\partial x} = t - \frac{1}{1+e^{x}} \cdot \frac{\partial e^{x}}{\partial x}$$
$$\frac{\partial L}{\partial x} = t - \frac{e^{x}}{1+e^{x}}$$
$$\frac{\partial L}{\partial x} = t - \frac{e^{x} \cdot e^{-x}}{(1+e^{x})\, e^{-x}}$$
$$\frac{\partial L}{\partial x} = t - \frac{1}{1+e^{-x}}$$
The second term is again the sigmoid function $P = \frac{1}{1+e^{-x}}$, hence:
$$\frac{\partial L}{\partial x} = t - P$$
So for both $x > 0$ and $x < 0$ the derivative is the same: the difference between the target value and the sigmoid value.
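The result $\partial L/\partial x = t - P$ can be verified with a finite-difference check (a NumPy sketch using the stable form of $L$):

```python
import numpy as np

def stable_loss(x, t):
    # L = x*t - max(x, 0) - ln(1 + e^{-|x|})
    return x * t - np.maximum(x, 0) - np.log1p(np.exp(-np.abs(x)))

def analytic_grad(x, t):
    # dL/dx = t - P, with P = sigmoid(x)
    return t - 1.0 / (1.0 + np.exp(-x))

x, t, eps = 1.3, 1.0, 1e-6
numeric = (stable_loss(x + eps, t) - stable_loss(x - eps, t)) / (2 * eps)
print(abs(numeric - analytic_grad(x, t)) < 1e-6)  # True
```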
3. A Custom Caffe Loss Layer in Python
Caffe officially provides a demo of a Python-based EuclideanLossLayer. Here, following the derivation above, we build a Python-based SigmoidCrossEntropyLossLayer for Caffe.
Caffe's built-in version is implemented in C++ - SigmoidCrossEntropyLossLayer; see [1].
Assume $\text{Labels} \in \{0, 1\}$.
3.1 SigmoidCrossEntropyLossLayer Implementation
Note that the layer computes the negated log-likelihood $-L = \max(x,0) - xt + \ln(1+e^{-|x|})$ (a loss to be minimized), so the stored gradient is $P - t$ rather than $t - P$.

```python
import caffe
import numpy as np
from scipy.special import expit  # numerically stable sigmoid


class CustomSigmoidCrossEntropyLossLayer(caffe.Layer):

    def setup(self, bottom, top):
        # check for all inputs
        if len(bottom) != 2:
            raise Exception("Need two inputs (scores and labels) "
                            "to compute sigmoid cross-entropy loss.")

    def reshape(self, bottom, top):
        # check that input dimensions match between the scores and labels
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        # the gradient has the same shape as the inputs
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # the layer outputs a scalar loss
        top[0].reshape(1)

    def forward(self, bottom, top):
        score = bottom[0].data
        label = bottom[1].data
        # stable form of the loss: max(x, 0) - x*t + ln(1 + e^{-|x|})
        first_term = np.maximum(score, 0)
        second_term = -score * label
        third_term = np.log1p(np.exp(-np.absolute(score)))
        top[0].data[...] = np.sum(first_term + second_term + third_term)
        # gradient of the summed loss w.r.t. the scores: P - t
        sig = expit(score)
        self.diff = sig - label
        if np.isnan(top[0].data).any():
            raise ValueError("NaN loss encountered")

    def backward(self, top, propagate_down, bottom):
        if propagate_down[0]:
            bottom[0].diff[...] = self.diff
```
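The layer's forward/backward arithmetic can be exercised without Caffe itself (a standalone NumPy sketch mirroring the layer's math; the layer class above still requires pycaffe):

```python
import numpy as np

def layer_forward(score, label):
    # same arithmetic as the layer: summed stable loss and its gradient
    loss = np.sum(np.maximum(score, 0) - score * label
                  + np.log1p(np.exp(-np.abs(score))))
    sig = 1.0 / (1.0 + np.exp(-score))  # sigmoid (expit equivalent)
    diff = sig - label                  # gradient w.r.t. score
    return loss, diff

score = np.array([-2.0, 0.5, 3.0])
label = np.array([0.0, 1.0, 1.0])
loss, diff = layer_forward(score, label)

# finite-difference check of one component of the gradient
eps = 1e-6
bumped = score.copy()
bumped[1] += eps
numeric = (layer_forward(bumped, label)[0] - loss) / eps
print(abs(numeric - diff[1]) < 1e-4)  # True
```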
3.2 Definition in the prototxt
```
layer {
  type: 'Python'
  name: 'loss'
  top: 'loss_opt'
  bottom: 'score'
  bottom: 'label'
  python_param {
    # the module name -- usually the filename -- which needs to be in $PYTHONPATH
    module: 'loss_layers'
    # the layer name -- the class name in the module
    layer: 'CustomSigmoidCrossEntropyLossLayer'
  }
  include {
    phase: TRAIN
  }
  # set loss weight so Caffe knows this is a loss layer.
  # since PythonLayer inherits directly from Layer, this isn't automatically
  # known to Caffe
  loss_weight: 1
}
```
4. Related
[1] - Caffe Loss Layers - SigmoidCrossEntropyLossLayer