
How does GBDT compute residuals when training a classifier?

2016-09-01 14:05
As everyone knows, for regression tasks GBDT uses loss = (y - pred)^2, so the residual (the negative gradient of the loss) is residual = 2*(y - pred), or simply y - pred under the conventional 0.5*(y - pred)^2 scaling; this is easy to understand.
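As a quick standalone check (my own sketch, not sklearn code), the residual really is just the negative gradient of the squared loss:

import numpy as np

y = np.array([3.0, -1.0, 2.5])
pred = np.array([2.0, 0.0, 2.0])

# loss = (y - pred)**2  =>  d(loss)/d(pred) = -2*(y - pred),
# so the negative gradient (the residual) is 2*(y - pred)
residual = 2 * (y - pred)
print(residual)  # [ 2. -2.  1.]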

So what does the residual look like when GBDT is used for classification?

Gradient Boosting attempts to solve this minimization problem numerically via steepest descent: the steepest descent direction is the negative gradient of the loss function evaluated at the current model F_{m-1}, which can be calculated for any differentiable loss function. The algorithms for regression and classification only differ in the concrete loss function used.
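To make this concrete, here is a minimal steepest-descent boosting loop for the squared loss (an illustrative sketch built on sklearn's public DecisionTreeRegressor, not the library's internal code; any differentiable loss would plug in the same way through its negative gradient):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.rand(100, 1)
y = np.sin(4 * X[:, 0]) + 0.1 * rng.randn(100)

F = np.full_like(y, y.mean())   # F_0: initial model
for m in range(50):
    residual = y - F            # negative gradient of 0.5*(y - F)**2 at F_{m-1}
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    F += 0.1 * tree.predict(X)  # step along the steepest-descent direction

print(np.mean((y - F) ** 2))    # training MSE shrinks as stages accumulate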

Below we take the classification deviance loss as an example: http://scikit-learn.org/stable/modules/ensemble.html#loss-functions

Classification

Binomial deviance ('deviance'): The negative binomial log-likelihood loss function for binary classification (provides probability estimates). The initial model is given by the log odds-ratio.
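Writing the negative binomial log-likelihood out makes the residual obvious. For y in {0, 1} and a raw (log-odds) score f, the loss is L(y, f) = log(1 + exp(f)) - y*f, so dL/df = expit(f) - y and the negative gradient is y - expit(f). A quick derivation check (my own sketch; y and f are not sklearn's variable names):

import numpy as np
from scipy.special import expit  # logistic sigmoid, 1/(1+exp(-x))

y = np.array([1.0, 0.0, 1.0])
f = np.array([0.0, 2.0, -1.0])  # raw log-odds scores, not probabilities

# L(y, f) = log(1 + exp(f)) - y*f  =>  dL/df = expit(f) - y
residual = y - expit(f)          # negative gradient
print(residual)                  # [ 0.5 -0.8808  0.7311] (rounded)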

0) In GradientBoostingClassifier's __init__:

1415: loss='deviance'
1423: super(GradientBoostingClassifier, self).__init__(loss=loss, ...)

1) fit() (https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L930) contains a key piece of code:



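The original screenshot of that code is lost, but the relevant flow in fit() at that commit looks approximately like this (paraphrased from the linked source; line placement and argument lists are approximate, see the link for the exact code):

y = self._validate_y(y)              # classes are encoded as 0..n_classes-1
self._check_params()                 # resolves self.loss_ (see step 3)
self._init_state()
self.init_.fit(X, y, sample_weight)  # initial model (log odds for deviance)
y_pred = self.init_.predict(X)       # F_0 on the training data
n_stages = self._fit_stages(X, y, y_pred, sample_weight,
                            random_state, begin_at_stage, monitor)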
2)_fit_stages(https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L1035)

1048: loss_ = self.loss_



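Again paraphrasing the linked source (arguments abbreviated): _fit_stages() runs the boosting loop and delegates each iteration to _fit_stage(), with loss_ passed along:

loss_ = self.loss_
for i in range(begin_at_stage, self.n_estimators):
    # ... (subsampling etc. elided) ...
    y_pred = self._fit_stage(i, X, y, y_pred, sample_weight,
                             sample_mask, random_state)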
Let's look at loss_ first, then at _fit_stage().

3) First, loss_() (https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L651):
LOSS_FUNCTIONS = {'ls': LeastSquaresError,
                  'lad': LeastAbsoluteError,
                  'huber': HuberLossFunction,
                  'quantile': QuantileLossFunction,
                  'deviance': None,  # for both, multinomial and binomial
                  'exponential': ExponentialLoss,
                  }
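Note that 'deviance' maps to None: the concrete loss class is chosen in _check_params() (same file) based on the number of classes, roughly like this (paraphrased, approximate):

if self.loss == 'deviance':
    loss_class = (MultinomialDeviance
                  if len(self.classes_) > 2
                  else BinomialDeviance)
else:
    loss_class = LOSS_FUNCTIONS[self.loss]
self.loss_ = loss_class(self.n_classes_)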
https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L465 (BinomialDeviance)



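The screenshot of the class body is lost; paraphrased from the linked source, BinomialDeviance looks roughly like this (docstrings trimmed, sample_weight branch elided):

class BinomialDeviance(ClassificationLossFunction):
    """Binomial deviance loss for binary classification (y in {0, 1})."""

    def init_estimator(self):
        # the initial model predicts the log odds-ratio log(p/(1-p))
        return LogOddsEstimator()

    def __call__(self, y, pred, sample_weight=None):
        # deviance = 2 * negative log-likelihood;
        # np.logaddexp(0, v) == log(1 + exp(v))
        pred = pred.ravel()
        return -2.0 * np.mean((y * pred) - np.logaddexp(0.0, pred))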
4) Then look at _fit_stage() (https://github.com/scikit-learn/scikit-learn/blob/51a765a/sklearn/ensemble/gradient_boosting.py#L747).



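The screenshot here is lost as well; around the referenced lines, _fit_stage() loops over the loss.K classes and, for each one, computes the residual and fits a regression tree (paraphrased from the linked source, tree parameters abbreviated):

for k in range(loss.K):
    if loss.is_multi_class:
        # one-vs-rest: for class k, y becomes a 0/1 indicator
        y = np.array(original_y == k, dtype=np.float64)

    # L759: the residual is the negative gradient of the loss
    residual = loss.negative_gradient(y, y_pred, k=k,
                                      sample_weight=sample_weight)

    # L763: the weak learner is a regression tree, even for a classifier
    tree = DecisionTreeRegressor(max_depth=self.max_depth,
                                 random_state=random_state)
    tree.fit(X, residual, sample_weight=sample_weight)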
From line 763 we can see that even for a classifier, the internal trees are regression trees.

Also, line 759 is the residual computation we care about most; following it back to L491, we find the code below:


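A reconstruction of the code at L491 (BinomialDeviance.negative_gradient, paraphrased from that commit):

def negative_gradient(self, y, pred, **kargs):
    """Compute the residual (= negative gradient)."""
    return y - expit(pred.ravel())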
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.special.expit.html

The expit function, also known as the logistic function, is defined as expit(x) = 1/(1+exp(-x)). It is the inverse of the logit function.
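A two-line check of that definition (standalone; scipy.special also provides logit):

import numpy as np
from scipy.special import expit, logit

x = np.array([-2.0, 0.0, 2.0])
print(expit(x))         # [0.1192 0.5    0.8808] (rounded)
print(logit(expit(x)))  # [-2.  0.  2.] -- expit inverts logit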

So our final conclusions are:

1) Classification likewise uses regression trees.

2) For binary classification, y can only be 1 or 0, and the residual is computed as y - pred, where pred is actually a probability produced by the logistic function!

3) For N-class classification, the problem is decomposed into N binary problems (each with y equal to 0 or 1). I have not read that code in detail, but this can be inferred from the note below (and from the negative_gradient sketch after it).

Note: Classification with more than 2 classes requires the induction of n_classes regression trees at each iteration; thus, the total number of induced trees equals n_classes * n_estimators. For datasets with a large number of classes we strongly recommend to use RandomForestClassifier as an alternative to GradientBoostingClassifier.
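Consistent with conclusion 3, MultinomialDeviance.negative_gradient in the same file computes, for class k, the indicator y minus the softmax probability of class k (paraphrased from that commit; logsumexp was imported from scipy.misc back then, in modern scipy it lives in scipy.special):

def negative_gradient(self, y, pred, k=0, **kwargs):
    """Compute negative gradient for the k-th class."""
    # y is the one-vs-rest 0/1 indicator for class k;
    # the subtracted term is the softmax probability of class k
    return y - np.nan_to_num(np.exp(pred[:, k] - logsumexp(pred, axis=1)))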