weka[7] - Adaboost
2014-06-25 19:51
Having finished the analysis of bagging, we naturally come to boosting, and the best-known boosting method is AdaBoost.
I remember a vivid analogy from a blog I read long ago that explains how AdaBoost works.
AdaBoost training is like a child learning to read an essay. On each pass through the same essay, the characters that were misread last time get extra practice (their weight is increased), so on the next pass they are naturally easier to tell apart. In the end, combining everything remembered from all the passes, the child can read this once-unfamiliar essay well.
The base learner AdaBoost typically uses is the decision stump: a one-level decision tree, which Weka also implements. The algorithm itself is very simple, but its implementation is long because it must handle many corner cases, so it is not analyzed here.
AdaBoost's main strength is strong generalization: the training error of a boosted ensemble has a theoretical upper bound that depends on the number of base learners and on their accuracy.
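For reference, the classical form of that bound (due to Freund and Schapire; the symbols below are my notation, not from the original post) on the training error of the combined classifier H over T rounds is:

```latex
\mathrm{err}_{\mathrm{train}}(H)
  \;\le\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t(1-\epsilon_t)}
  \;\le\; \exp\!\Big(-2\sum_{t=1}^{T}\gamma_t^2\Big),
\qquad \gamma_t = \tfrac{1}{2}-\epsilon_t
```

where epsilon_t is the weighted error of the t-th base learner. As long as each stump is even slightly better than random guessing (gamma_t > 0), the bound shrinks exponentially in T.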
Its main weakness is just as clear: it is very sensitive to outliers. This is easy to understand, since the weights of hard instances keep growing round after round (a property that can also be exploited for outlier detection).
Basic AdaBoost is a binary classifier, but modified versions (such as M1 and MH) can handle multi-class problems.
Now to the main topic: how Weka implements AdaBoost.
constructor:
public AdaBoostM1() {
    m_Classifier = new weka.classifiers.trees.DecisionStump();
}

As the constructor shows, the default base learner of Weka's AdaBoost is a decision stump.
buildClassifier:
public void buildClassifier(Instances data) throws Exception {

    super.buildClassifier(data);

    // can classifier handle the data?
    getCapabilities().testWithFail(data);

    // remove instances with missing class
    data = new Instances(data);
    data.deleteWithMissingClass();

    // only class? -> build ZeroR model
    if (data.numAttributes() == 1) {
        System.err.println(
            "Cannot build model (only class attribute present in data!), "
            + "using ZeroR model instead!");
        m_ZeroR = new weka.classifiers.rules.ZeroR();
        m_ZeroR.buildClassifier(data);
        return;
    } else {
        m_ZeroR = null;
    }

    m_NumClasses = data.numClasses();
    if ((!m_UseResampling) && (m_Classifier instanceof WeightedInstancesHandler)) {
        buildClassifierWithWeights(data);
    } else {
        buildClassifierUsingResampling(data);
    }
}

The real work happens in two methods: buildClassifierWithWeights and buildClassifierUsingResampling.
buildClassifierUsingResampling:
protected void buildClassifierUsingResampling(Instances data) throws Exception {

    Instances trainData, sample, training;
    double epsilon, reweight, sumProbs;
    Evaluation evaluation;
    int numInstances = data.numInstances();
    Random randomInstance = new Random(m_Seed);
    int resamplingIterations = 0;

    // Initialize data
    m_Betas = new double[m_Classifiers.length];
    m_NumIterationsPerformed = 0;

    // Create a copy of the data so that when the weights are diddled
    // with it doesn't mess up the weights for anyone else
    training = new Instances(data, 0, numInstances);
    sumProbs = training.sumOfWeights();
    // normalize the weights
    for (int i = 0; i < training.numInstances(); i++) {
        training.instance(i).setWeight(training.instance(i).weight() / sumProbs);
    }

    // Do bootstrap iterations
    for (m_NumIterationsPerformed = 0; m_NumIterationsPerformed < m_Classifiers.length;
         m_NumIterationsPerformed++) {
        if (m_Debug) {
            System.err.println("Training classifier " + (m_NumIterationsPerformed + 1));
        }

        // Select instances to train the classifier on
        if (m_WeightThreshold < 100) {
            trainData = selectWeightQuantile(training, (double) m_WeightThreshold / 100);
        } else {
            trainData = new Instances(training);
        }

        // Resample
        resamplingIterations = 0;
        double[] weights = new double[trainData.numInstances()];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = trainData.instance(i).weight();
        }
        do {
            sample = trainData.resampleWithWeights(randomInstance, weights);

            // Build and evaluate classifier
            m_Classifiers[m_NumIterationsPerformed].buildClassifier(sample);
            evaluation = new Evaluation(data);
            evaluation.evaluateModel(m_Classifiers[m_NumIterationsPerformed], training);
            epsilon = evaluation.errorRate();
            resamplingIterations++;
        } while (Utils.eq(epsilon, 0) &&
                 (resamplingIterations < MAX_NUM_RESAMPLING_ITERATIONS));

        // Stop if error too big or 0
        if (Utils.grOrEq(epsilon, 0.5) || Utils.eq(epsilon, 0)) {
            if (m_NumIterationsPerformed == 0) {
                m_NumIterationsPerformed = 1; // If we're the first we have to use it
            }
            break;
        }

        // Determine the weight to assign to this model
        m_Betas[m_NumIterationsPerformed] = Math.log((1 - epsilon) / epsilon);
        reweight = (1 - epsilon) / epsilon;
        if (m_Debug) {
            System.err.println("\terror rate = " + epsilon + " beta = " +
                               m_Betas[m_NumIterationsPerformed]);
        }

        // Update instance weights
        setWeights(training, reweight);
    }
}

This is the resampling variant. It first normalizes the instance weights of the data, then loops to build the k decision stumps.
When constructing a decision stump, its error rate is needed in order to update the weights for the next round, so the implementation insists on error > 0: as long as the error is exactly 0, it keeps drawing a new sample and rebuilding the stump, up to MAX_NUM_RESAMPLING_ITERATIONS times. Also note that each stump is trained on a sample drawn with replacement according to the current instance weights.
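Instances.resampleWithWeights itself is part of Weka and not shown here; conceptually it draws n instances with replacement, each instance being picked with probability proportional to its weight. A minimal standalone sketch of that idea (class and method names are mine, not Weka's):

```java
import java.util.Arrays;
import java.util.Random;

public class WeightedResample {

    // Draw weights.length indices with replacement, index i being chosen
    // with probability weights[i] / sum(weights).
    static int[] resampleWithWeights(double[] weights, Random rnd) {
        int n = weights.length;
        double[] cumulative = new double[n];
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += weights[i];
            cumulative[i] = sum;               // running total of the weights
        }
        int[] sample = new int[n];
        for (int j = 0; j < n; j++) {
            double r = rnd.nextDouble() * sum; // uniform draw in [0, sum)
            int idx = Arrays.binarySearch(cumulative, r);
            if (idx < 0) idx = -idx - 1;       // insertion point = chosen index
            sample[j] = idx;
        }
        return sample;
    }

    public static void main(String[] args) {
        // Instance 2 carries most of the weight, so it dominates the sample.
        double[] w = {0.05, 0.05, 0.85, 0.05};
        System.out.println(Arrays.toString(resampleWithWeights(w, new Random(42))));
    }
}
```

Heavily weighted (i.e., frequently misclassified) instances therefore appear many times in the sample, which is how resampling simulates weighted training for base learners that ignore instance weights.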
Once the error is obtained, the instance weights are updated and the next iteration begins. When the k iterations finish, the final k classifiers are stored in m_Classifiers.
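setWeights itself is not shown in the snippets above; judging from the factor reweight = (1 - epsilon) / epsilon, it presumably multiplies the weight of every misclassified instance by reweight and then renormalizes so the total weight is unchanged. A standalone sketch of that update (names are mine, not Weka's):

```java
public class ReweightSketch {

    // AdaBoost.M1-style update: boost the weight of every misclassified
    // instance by reweight = (1 - epsilon) / epsilon, then rescale all
    // weights so their sum is the same as before the update.
    static double[] updateWeights(double[] weights, boolean[] misclassified, double reweight) {
        double oldSum = 0, newSum = 0;
        double[] updated = weights.clone();
        for (double w : weights) oldSum += w;
        for (int i = 0; i < updated.length; i++) {
            if (misclassified[i]) updated[i] *= reweight;
            newSum += updated[i];
        }
        for (int i = 0; i < updated.length; i++) {
            updated[i] *= oldSum / newSum;     // renormalize
        }
        return updated;
    }

    public static void main(String[] args) {
        // One of four equally weighted instances misclassified:
        // epsilon = 0.25, so reweight = (1 - 0.25) / 0.25 = 3.
        double[] w = {0.25, 0.25, 0.25, 0.25};
        double[] u = updateWeights(w, new boolean[]{true, false, false, false}, 3.0);
        System.out.printf("%.4f %.4f%n", u[0], u[1]); // misclassified vs. correct
    }
}
```

In the example the misclassified instance ends up with weight 0.5 and each correct one with 1/6, which is exactly the "read the misread characters more" effect from the analogy at the top of the post.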
buildClassifierWithWeights:
protected void buildClassifierWithWeights(Instances data) throws Exception {

    Instances trainData, training;
    double epsilon, reweight;
    Evaluation evaluation;
    int numInstances = data.numInstances();
    Random randomInstance = new Random(m_Seed);

    // Initialize data
    m_Betas = new double[m_Classifiers.length];
    m_NumIterationsPerformed = 0;

    // Create a copy of the data so that when the weights are diddled
    // with it doesn't mess up the weights for anyone else
    training = new Instances(data, 0, numInstances);

    // Do bootstrap iterations
    for (m_NumIterationsPerformed = 0; m_NumIterationsPerformed < m_Classifiers.length;
         m_NumIterationsPerformed++) {
        if (m_Debug) {
            System.err.println("Training classifier " + (m_NumIterationsPerformed + 1));
        }

        // Select instances to train the classifier on
        if (m_WeightThreshold < 100) {
            trainData = selectWeightQuantile(training, (double) m_WeightThreshold / 100);
        } else {
            trainData = new Instances(training, 0, numInstances);
        }

        // Build the classifier
        if (m_Classifiers[m_NumIterationsPerformed] instanceof Randomizable)
            ((Randomizable) m_Classifiers[m_NumIterationsPerformed]).setSeed(randomInstance.nextInt());
        m_Classifiers[m_NumIterationsPerformed].buildClassifier(trainData);

        // Evaluate the classifier
        evaluation = new Evaluation(data);
        evaluation.evaluateModel(m_Classifiers[m_NumIterationsPerformed], training);
        epsilon = evaluation.errorRate();

        // Stop if error too small or error too big and ignore this model
        if (Utils.grOrEq(epsilon, 0.5) || Utils.eq(epsilon, 0)) {
            if (m_NumIterationsPerformed == 0) {
                m_NumIterationsPerformed = 1; // If we're the first we have to use it
            }
            break;
        }

        // Determine the weight to assign to this model
        m_Betas[m_NumIterationsPerformed] = Math.log((1 - epsilon) / epsilon);
        reweight = (1 - epsilon) / epsilon;
        if (m_Debug) {
            System.err.println("\terror rate = " + epsilon + " beta = " +
                               m_Betas[m_NumIterationsPerformed]);
        }

        // Update instance weights
        setWeights(training, reweight);
    }
}

This is almost identical to the method above. The only real difference is the training set for each round: the resampling variant trains on a sample drawn with replacement, while this one trains directly on the weighted original data (possible because the base learner implements WeightedInstancesHandler).
Finally, a word on why the class in Weka is called AdaBoostM1.
M1 is the variant in which the base learner may itself be multi-class, and the decision stump used here does support multiple classes.
MH, by contrast, recodes the original classes into {+1, -1}^n form, which effectively boosts n binary classifiers.
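For completeness, prediction is not shown in the post either. In AdaBoost.M1 each of the k stumps votes for its predicted class with weight beta_t = log((1 - epsilon_t) / epsilon_t) (the m_Betas array above), and the class with the largest total wins. A minimal sketch of that combination rule (again, names are mine, not Weka's):

```java
public class VoteSketch {

    // Combine the predictions of the k base classifiers: classifier t
    // votes for its predicted class with weight betas[t]; the class with
    // the highest total vote wins.
    static int weightedVote(int[] predictions, double[] betas, int numClasses) {
        double[] totals = new double[numClasses];
        for (int t = 0; t < predictions.length; t++) {
            totals[predictions[t]] += betas[t];
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (totals[c] > totals[best]) best = c;
        }
        return best;
    }

    public static void main(String[] args) {
        // A single accurate stump (large beta) outvotes two weak ones.
        int winner = weightedVote(new int[]{0, 1, 1}, new double[]{2.0, 0.5, 0.5}, 2);
        System.out.println(winner); // prints 0
    }
}
```

Because beta_t grows as epsilon_t shrinks, more accurate stumps get proportionally louder votes, which is why this scheme naturally extends to the multi-class M1 setting.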