How is the residual computed when GBDT trains a classifier?
2016-09-01 14:05
As everyone knows, for regression tasks GBDT uses the squared loss, loss = (y - pred)^2, so the residual is the negative gradient, residual = 2*(y - pred); with the conventional 1/2*(y - pred)^2 scaling this is simply y - pred, which is easy to understand.
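As a quick numeric check (my own sketch, not sklearn code), the negative gradient of the 1/2-scaled squared loss is exactly this residual:

import numpy as np

# Sketch: for L = 0.5 * (y - pred)^2, the negative gradient
# with respect to pred is y - pred.
y = np.array([3.0, -1.0, 2.0])
pred = np.array([2.5, 0.0, 2.0])

loss = 0.5 * (y - pred) ** 2
residual = y - pred            # negative gradient of 0.5*(y-pred)^2
print(residual)                # -> [ 0.5 -1.   0. ]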
So, when GBDT does a classification task, what does the residual look like???
Gradient Boosting attempts to solve this minimization problem numerically via steepest descent. The steepest descent direction is the negative gradient of the loss function evaluated at the current model, which can be calculated for any differentiable loss function. The algorithms for regression and classification only differ in the concrete loss function used.
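To make this concrete, here is a toy boosting loop of my own (a sketch with squared loss, not sklearn's implementation): each stage fits a regression tree to the negative gradient evaluated at the current model.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbdt_sketch(X, y, n_estimators=100, learning_rate=0.1):
    """Toy gradient boosting with squared loss; a sketch, not sklearn's code."""
    pred = np.full(len(y), y.mean())     # initial constant model
    trees = []
    for _ in range(n_estimators):
        residual = y - pred              # negative gradient of 0.5*(y-pred)^2
        tree = DecisionTreeRegressor(max_depth=3)
        tree.fit(X, residual)            # fit a regression tree to the residual
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return trees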
Below we take the classification deviance loss as an example: http://scikit-learn.org/stable/modules/ensemble.html#loss-functions
Classification: Binomial deviance ('deviance'): The negative binomial log-likelihood loss function for binary classification (provides probability estimates). The initial model is given by the log odds-ratio.
0) In GradientBoostingClassifier's __init__:
1415: loss='deviance'
1423: super(GradientBoostingClassifier, self).__init__(loss=loss, ...)
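A quick check of this default (a minimal usage sketch, nothing more):

from sklearn.ensemble import GradientBoostingClassifier

clf = GradientBoostingClassifier()
print(clf.get_params()['loss'])  # -> 'deviance'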
1) fit (https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L930) contains the key piece of code: the call that hands the actual boosting loop off to _fit_stages, which we look at next.
2) _fit_stages (https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L1035):
1048: loss_ = self.loss_
Let's look at loss_ first, then at _fit_stage().
3) First, loss_:
https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L651
https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L465 (BinomialDeviance)
LOSS_FUNCTIONS = {'ls': LeastSquaresError,
                  'lad': LeastAbsoluteError,
                  'huber': HuberLossFunction,
                  'quantile': QuantileLossFunction,
                  'deviance': None,  # for both, multinomial and binomial
                  'exponential': ExponentialLoss,
                  }
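Note the None entry for 'deviance': as I read the surrounding code, the concrete loss class is only resolved at fit time, depending on the number of classes (BinomialDeviance for 2 classes, MultinomialDeviance otherwise). A minimal check of my own (the dataset is made up):

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# With 2 classes, 'deviance' should resolve to BinomialDeviance;
# with >2 classes, to MultinomialDeviance (assumption based on the
# comment in the dict; see the L651 link above for the exact code).
X, y = make_classification(n_samples=100, random_state=0)
clf = GradientBoostingClassifier(loss='deviance').fit(X, y)
print(type(clf.loss_).__name__)  # expected: BinomialDeviance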
4) Then, _fit_stage() (https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L747).
Line 763 tells us that even in the classifier, the internal trees are regression trees.
Also, line 759 is the residual computation we care most about; following it back to L491 in BinomialDeviance, we find code along these lines:
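Reconstructed from the conclusions below and the expit reference that follows (check the L465/L491 links for the verbatim source), the negative gradient is essentially:

from scipy.special import expit

def negative_gradient(y, pred):
    """Binomial-deviance residual as I read L491 (a reconstruction, not a
    verbatim quote): the label minus the predicted probability obtained by
    squashing the raw score pred with the logistic function."""
    return y - expit(pred.ravel())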
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.special.expit.html
The expit function, also known as the logistic function, is defined as expit(x) = 1/(1+exp(-x)). It is the inverse of the logit function.
So our final conclusions are:
1) Classification likewise uses regression trees internally.
2) For binary classification, y can only be 1 or 0, and the residual is computed as y - pred, where pred is actually a probability produced by the logistic function!
3) For N-class classification, the problem is converted into N binary classification problems (each with y being 0 or 1). This can be inferred from the note below (see the sketch right after it); I have not read that code in detail.
Note: Classification with more than 2 classes requires the induction of n_classes regression trees at each iteration; thus, the total number of induced trees equals n_classes * n_estimators. For datasets with a large number of classes we strongly recommend to use RandomForestClassifier as an alternative to GradientBoostingClassifier.
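For completeness, a rough sketch of what the multiclass residual looks like as I understand MultinomialDeviance (my own reconstruction, not verified against the source line by line): each class k gets the residual indicator(y == k) - softmax(pred)_k, and one regression tree is fit per class per iteration.

import numpy as np

def multiclass_residual(y, pred):
    """Sketch of the multinomial-deviance residual (my reconstruction).
    pred has shape (n_samples, n_classes) of raw scores; y holds integer
    class labels in [0, n_classes)."""
    exp = np.exp(pred - pred.max(axis=1, keepdims=True))  # numerically stable softmax
    proba = exp / exp.sum(axis=1, keepdims=True)
    onehot = np.eye(pred.shape[1])[y]                     # indicator(y == k)
    return onehot - proba                                 # one residual column per class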