libsvm for MATLAB
2014-02-28 14:52
Reposted from: https://sites.google.com/site/kittipat/libsvm_matlab
Libsvm is a great tool for SVM: it is easy to use and well documented. The libsvm package webpage is maintained by Chih-Chung Chang and Chih-Jen Lin of NTU. The webpage can be found here.
I made this tutorial as a reminder for myself when I need to use it again. All credit goes to the libsvm developers. Here is how you can cite libsvm.
Content
In this short tutorial, the following topics will be discussed:
How to install libsvm for MATLAB on a Unix machine
Linear-kernel SVM for binary classification
kernel SVM for binary classification
cross validation for C and Gamma
multi-class SVM: one-vs-rest (OVR)
More ready-to-use matlab example
Available matlab codes to download
Here is how to install the toolbox
Just read the README file in the package; it's very easy. You can build it either from a terminal or from the MATLAB workspace. On an Ubuntu machine, just make sure you have gcc installed. If not, you need to install it using the command below:
sudo apt-get install build-essential g++
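From the MATLAB side, the build step described in the README can be sketched as follows. This is a minimal sketch, assuming the package was unpacked to ../libsvm-3.12 (the path used in the examples below) and that a compiler is already configured for mex; adjust the path to your own layout.

```matlab
% Assumed location of the package; change it to match your machine.
libsvmMatlabDir = '../libsvm-3.12/matlab';

cd(libsvmMatlabDir);   % the MATLAB interface lives in the matlab/ subfolder
make                   % runs the make.m shipped with libsvm to build the MEX files
addpath(pwd);          % make svmtrain / svmpredict visible to MATLAB

% Check which svmtrain MATLAB will call. Some older MATLAB releases ship
% their own svmtrain in the Statistics Toolbox, which can shadow libsvm's.
which svmtrain
```

Note that cd changes the working directory, so relative paths used afterwards are resolved from the matlab/ subfolder.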
Basic SVM: Linear-kernel SVM for binary classification
Below is the first code to run. The code is for binary classification and uses the parameters c = 1 and gamma (g) = 0.07; '-b 1' requests probability outputs.
% This code simply runs the SVM on the example data set "heart_scale",
% which is already scaled properly. The code divides the data into 2 parts
% train: 1 to 200
% test: 201 to 270
% Then it plots the results vs their true class. In order to visualize the
% high-dimensional data, we apply MDS to the 13D data and reduce the
% dimension to 2D
clear
clc
close all
% addpath to the libsvm toolbox
addpath('../libsvm-3.12/matlab');
% addpath to the data
dirData = '../libsvm-3.12';
addpath(dirData);
% read the data set
[heart_scale_label, heart_scale_inst] = libsvmread(fullfile(dirData,'heart_scale'));
[N D] = size(heart_scale_inst);
% Determine the train and test index
trainIndex = zeros(N,1); trainIndex(1:200) = 1;
testIndex = zeros(N,1); testIndex(201:N) = 1;
trainData = heart_scale_inst(trainIndex==1,:);
trainLabel = heart_scale_label(trainIndex==1,:);
testData = heart_scale_inst(testIndex==1,:);
testLabel = heart_scale_label(testIndex==1,:);
% Train the SVM
model = svmtrain(trainLabel, trainData, '-c 1 -g 0.07 -b 1');
% Use the SVM model to classify the data
[predict_label, accuracy, prob_values] = svmpredict(testLabel, testData, model, '-b 1'); % run the SVM model on the test data
% ================================
% ===== Showing the results ======
% ================================
% Assign color for each class
% colorList = generateColorList(2);
% This is my own way to assign the color...don't worry about it
colorList = prism(100);
% true (ground truth) class
trueClassIndex = zeros(N,1);
trueClassIndex(heart_scale_label==1) = 1;
trueClassIndex(heart_scale_label==-1) = 2;
colorTrueClass = colorList(trueClassIndex,:);
% result Class
resultClassIndex = zeros(length(predict_label),1);
resultClassIndex(predict_label==1) = 1;
resultClassIndex(predict_label==-1) = 2;
colorResultClass = colorList(resultClassIndex,:);
% Reduce the dimension from 13D to 2D
distanceMatrix = pdist(heart_scale_inst,'euclidean');
newCoor = mdscale(distanceMatrix,2);
% Plot the whole data set
x = newCoor(:,1);
y = newCoor(:,2);
patchSize = 30; %max(prob_values,[],2);
colorTrueClassPlot = colorTrueClass;
figure; scatter(x,y,patchSize,colorTrueClassPlot,'filled');
title('whole data set');
% Plot the test data
x = newCoor(testIndex==1,1);
y = newCoor(testIndex==1,2);
patchSize = 80*max(prob_values,[],2);
colorTrueClassPlot = colorTrueClass(testIndex==1,:);
figure; hold on;
scatter(x,y,2*patchSize,colorTrueClassPlot,'o','filled');
scatter(x,y,patchSize,colorResultClass,'o','filled');
% Plot the training set
x = newCoor(trainIndex==1,1);
y = newCoor(trainIndex==1,2);
patchSize = 30;
colorTrueClassPlot = colorTrueClass(trainIndex==1,:);
scatter(x,y,patchSize,colorTrueClassPlot,'o'); title('classification results');
The result shows:
optimization finished, #iter = 137
nu = 0.457422
obj = -76.730867, rho = 0.435233
nSV = 104, nBSV = 81
Total nSV = 104
Accuracy = 81.4286% (57/70) (classification)
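The accuracy line in this log is simply the percentage of test samples whose predicted label matches the true one. Below is a small check of that arithmetic, using the counts reported above (57 correct out of 70) and a toy label vector whose values are made up for illustration:

```matlab
% Counts taken from the log above
nCorrect = 57;
nTest    = 70;
accuracy = 100 * nCorrect / nTest;
fprintf('Accuracy = %g%% (%d/%d)\n', accuracy, nCorrect, nTest);

% The same computation from label vectors (toy values, for illustration only)
toyTrue = [1; -1;  1; 1; -1];
toyPred = [1; -1; -1; 1; -1];
toyAccuracy = 100 * mean(toyPred == toyTrue);   % 4 of the 5 labels agree
```

With the real data, the same expression applied to predict_label and testLabel should agree with the first element of the accuracy output of svmpredict.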
The first figure plots the whole data set, and the second shows the classification results.
The unfilled markers represent data instances from the train set. The filled markers represent data instances from the test set: the fill color represents the class label assigned by the SVM, whereas the edge color represents the true (ground-truth) label. The marker size of a test instance represents the probability that the instance is assigned its corresponding class label; the bigger the marker, the higher the confidence.
Kernel SVM for binary classification
Now let's apply some kernels to the SVM. We use almost the same code as before. The only difference is that the train data set, trainData, is replaced by its kernelized version [(1:200)' trainData*trainData'], and the test data, testData, is replaced by its kernelized version [(1:70)' testData*trainData'], as shown below. The first column holds the sample serial numbers that libsvm expects in front of a precomputed kernel matrix ('-t 4').
% Train the SVM on the precomputed (here linear) kernel matrix.
% Note: with '-t 4' libsvm uses the supplied kernel values, so '-g' has no effect.
model = svmtrain(trainLabel, [(1:200)' trainData*trainData'], '-c 1 -g 0.07 -b 1 -t 4');
% Use the SVM model to classify the test data
[predict_label, accuracy, prob_values] = svmpredict(testLabel, [(1:70)' testData*trainData'], model, '-b 1');
The complete code can be found here.
The results for each kernel are shown below.
'Linear' kernel
optimization finished, #iter = 403796
nu = 0.335720
obj = -67.042781, rho = -1.252604
nSV = 74, nBSV = 60
Total nSV = 74
Accuracy = 85.7143% (60/70) (classification)
'polynomial' kernel
optimization finished, #iter = 102385
nu = 0.000001
obj = -0.000086, rho = -0.465342
nSV = 69, nBSV = 0
Total nSV = 69
Accuracy = 72.8571% (51/70) (classification)
'RBF' kernel
optimization finished, #iter = 372
nu = 0.890000
obj = -97.594730, rho = 0.194414
nSV = 200, nBSV = 90
Total nSV = 200
Accuracy = 57.1429% (40/70) (classification)
'Sigmoid' kernel
optimization finished, #iter = 90
nu = 0.870000
obj = -195.417169, rho = 0.999993
nSV = 174, nBSV = 174
Total nSV = 174
Accuracy = 60% (42/70) (classification)
'MLP' kernel
optimization finished, #iter = 1247
nu = 0.352616
obj = -68.842421, rho = -0.552693
nSV = 77, nBSV = 63
Total nSV = 77
Accuracy = 82.8571% (58/70) (classification)
kernel | accuracy |
Linear | 85.7% |
Polynomial | 72.86% |
RBF | 57.14% |
Sigmoid | 60% |
MLP | 82.86% |
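The kernelized inputs used in this section can be built for any kernel, not only the linear one. The sketch below builds the linear version from the text and, as an illustration, an RBF version. The toy matrices and the value of g are assumptions chosen only to make the snippet self-contained, and the svmtrain call is left as a comment because it needs libsvm on the path.

```matlab
% Toy data standing in for trainData / testData (values are made up)
trainX = [0 0; 1 0; 0 1; 1 1];
testX  = [0.5 0.5; 2 2];
nTrain = size(trainX, 1);
nTest  = size(testX, 1);

% Linear kernel, as in the text: K(i,j) = x_i * x_j'
KtrainLin = [(1:nTrain)' trainX * trainX'];
KtestLin  = [(1:nTest)'  testX  * trainX'];

% RBF kernel: K(i,j) = exp(-g * |x_i - x_j|^2)
g       = 0.07;                               % illustrative value
sqTrain = sum(trainX.^2, 2);
sqTest  = sum(testX.^2, 2);
d2Train = bsxfun(@plus, sqTrain, sqTrain') - 2 * (trainX * trainX');
d2Test  = bsxfun(@plus, sqTest,  sqTrain') - 2 * (testX  * trainX');
KtrainRbf = [(1:nTrain)' exp(-g * d2Train)];
KtestRbf  = [(1:nTest)'  exp(-g * d2Test)];

% With libsvm available, the matrices are passed with '-t 4', e.g.
%   model = svmtrain(trainLabel, KtrainRbf, '-c 1 -t 4');
%   pred  = svmpredict(testLabel, KtestRbf, model);
```

The test matrix always has one row per test sample and one column per training sample (plus the serial-number column), because the kernel is evaluated between each test point and every training point.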
Cross validation of C and Gamma
The options for svmtrain are listed below. For n-fold cross validation ('-v n'), n must be >= 2.
Usage: model = svmtrain(training_label_vector, training_instance_matrix, 'libsvm_options');
libsvm_options:
-s svm_type : set type of SVM (default 0)
    0 -- C-SVC
    1 -- nu-SVC
    2 -- one-class SVM
    3 -- epsilon-SVR
    4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
    4 -- precomputed kernel (kernel values in training_instance_matrix)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n : n-fold cross validation mode
-q : quiet mode (no outputs)
In this example, we use the svmtrain option that enforces n-fold cross validation: simply put '-v n' in the parameter string, where n is the number of folds. With '-v', svmtrain returns the cross-validation accuracy instead of a model. Here is an example using 3-fold cross validation:
param = ['-q -v 3 -c ', num2str(c), ' -g ', num2str(g)];
cv = svmtrain(trainLabel, trainData, param);
In the example below, I show a multi-scale cross validation search. First, we search for the optimal parameters (c and gamma) on a coarse scale; then the search space is narrowed down around the best values until we are satisfied.
The results are compared with the first experiment, which does not use the optimal parameters. The full code can be found here.
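The coarse-to-fine search described above can be sketched as follows. This is only an illustration of the idea, not the author's routine: cvAccuracy is a hypothetical stand-in (a toy surface with a known peak) so that the snippet runs without libsvm, and the number of levels, grid size, and starting point are arbitrary choices.

```matlab
% Hypothetical stand-in for the cross-validation accuracy (a toy surface that
% peaks at c = 2^3, g = 2^-5). With libsvm on the path, replace it with e.g.
%   cvAccuracy = @(c, g) svmtrain(trainLabel, trainData, ...
%                                 sprintf('-q -v 3 -c %g -g %g', c, g));
cvAccuracy = @(c, g) 100 - (log2(c) - 3)^2 - (log2(g) + 5)^2;

centerLog2c = 0;      % initial guess for log2(c)
centerLog2g = 0;      % initial guess for log2(g)
stepSize    = 4;      % grid spacing in log2 units (coarse at first)
bestcv      = -Inf;

for level = 1:4
    % 5 x 5 grid centred on the best point found so far
    for log2c = centerLog2c + (-2:2) * stepSize
        for log2g = centerLog2g + (-2:2) * stepSize
            cv = cvAccuracy(2^log2c, 2^log2g);
            if cv > bestcv
                bestcv = cv; bestLog2c = log2c; bestLog2g = log2g;
            end
        end
    end
    % Narrow the search around the current best and halve the spacing
    centerLog2c = bestLog2c;
    centerLog2g = bestLog2g;
    stepSize    = stepSize / 2;
end

fprintf('best c = %g, best g = %g, cv = %g\n', 2^bestLog2c, 2^bestLog2g, bestcv);
```

Searching in log2 units keeps the grid evenly spaced over several orders of magnitude, which is why the demo code in this tutorial also loops over log2c and log2g.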
Multi-class SVM
Naturally, SVM is a binary classification model, so how can we use SVM in the multi-class scenario? In this example, we will show how to do multi-class classification using libsvm. A simple strategy is to solve one binary classification problem at a time. Here we will use the one-versus-rest approach, where each class in turn is separated from all the others. In fact, we can just use the original functions (svmtrain and svmpredict) from the libsvm package to do the job by making a "wrapper code" that calls them once per class. The good news is that the libsvm tutorial page already provides such wrapper code, so we will just use it.
Just download the demo code from the end of this URL,
which says
[code]
[trainY trainX] = libsvmread('./dna.scale');
[testY testX] = libsvmread('./dna.scale.t');
model = ovrtrain(trainY, trainX, '-c 8 -g 4');
[pred ac decv] = ovrpredict(testY, testX, model);
fprintf('Accuracy = %g%%\n', ac * 100);
[/code]
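The one-versus-rest idea behind such a wrapper can be sketched in a few lines. This is an illustration of the technique, not the source of ovrtrain or ovrpredict: the data are toy values, and the binary scorer is a deliberately simple stand-in (negative distance to the class mean) so that the snippet runs without libsvm.

```matlab
% Toy data: three classes in 2-D (illustrative values only)
X = [0 0; 0 1; 5 5; 5 6; 10 0; 10 1];
y = [1; 1; 2; 2; 3; 3];

labelSet = unique(y);
K = numel(labelSet);
scores = zeros(size(X, 1), K);

for k = 1:K
    % "One" vs "rest" relabeling: class k is positive, everything else negative
    isPositive = (y == labelSet(k));

    % Stand-in binary scorer (NOT an SVM): negative distance to the mean of the
    % positive class. With libsvm, a binary SVM would be trained here on
    % double(isPositive) and its decision values used as the scores.
    mu = mean(X(isPositive, :), 1);
    scores(:, k) = -sqrt(sum(bsxfun(@minus, X, mu).^2, 2));
end

% Each sample goes to the class whose binary scorer is most confident
[~, winner] = max(scores, [], 2);
pred = labelSet(winner);
```

The last step matters in practice: max returns a column index, which has to be mapped back through labelSet to obtain the actual class label.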
The functions ovrtrain and ovrpredict are the wrappers. You can also do the cross validation with the demo code below, where get_cv_ac is again a wrapper.
[code]
bestcv = 0;
for log2c = -1:2:3,
  for log2g = -4:2:1,
    cmd = ['-q -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    cv = get_cv_ac(trainY, trainX, cmd, 3);
    if (cv >= bestcv),
      bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
    end
    fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
  end
end
[/code]
The fully implemented code can be found here.
More examples
You may find the following examples useful. Each code is built for a specific application; you can download and tweak it to save development time.
Big picture: In this scenario, I compiled an easy example to illustrate how to use SVM in a full process. The code contains:
data generation
determining train and test data set
parameter selection using n-fold cross validation, both semi-manual and the automatic approach
train the svm model using one-versus-rest (OVR) approach
use the svm model to classify the test set in OVR mode
make confusion matrix to evaluate the results
show the results in an informative way
display the decision boundary on the feature space
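Of the steps above, the confusion matrix is easy to sketch on its own. The label vectors below are toy values made up for illustration; in the real workflow they would be the true test labels and the labels returned by the OVR prediction step.

```matlab
% Toy true and predicted labels (illustrative values only)
trueLabel = [1 1 2 2 3 3 3]';
predLabel = [1 2 2 2 3 1 3]';

% Map labels to consecutive indices so that any label values work
labels = unique([trueLabel; predLabel]);
[~, trueIdx] = ismember(trueLabel, labels);
[~, predIdx] = ismember(predLabel, labels);

% C(i,j) = number of samples of true class i that were predicted as class j
nClass = numel(labels);
C = accumarray([trueIdx predIdx], 1, [nClass nClass]);

% Overall accuracy is the diagonal mass of the confusion matrix
overallAccuracy = 100 * trace(C) / sum(C(:));
disp(C);
```

Rows correspond to the true class and columns to the predicted class, so off-diagonal entries show which classes are confused with each other.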
Reporting results using n-fold cross validation: In case you have only one data set (i.e., there is no explicit train or test set), n-fold cross validation is a conventional way to assess a classifier. The overall accuracy is obtained by averaging the accuracies of the n folds. The observations are split equally into n folds; the code uses n-1 folds to train the SVM model, which is then used to classify the remaining fold according to standard OVR. The code can be found here.
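The fold bookkeeping can be sketched as below. This is a minimal illustration under stated assumptions: the labels are toy values, the variable names are hypothetical, and the "classifier" is a trivial majority-vote stand-in so that the snippet runs without libsvm.

```matlab
% Toy labels (made up): 8 observations of class 1 and 4 of class 2
y     = [1 1 1 1 1 1 1 1 2 2 2 2]';
N     = numel(y);
nFold = 3;

% Assign a "run" (fold) number to every observation: 1,2,3,1,2,3,...
% Random assignment also works, e.g. runLabel = runLabel(randperm(N));
runLabel = mod((0:N-1)', nFold) + 1;

foldAccuracy = zeros(nFold, 1);
for k = 1:nFold
    testIdx  = (runLabel == k);
    trainIdx = ~testIdx;

    % Stand-in "classifier" (NOT an SVM): always predict the most frequent
    % training label. With libsvm this is where the model would be trained on
    % the observations in trainIdx and used to predict those in fold k.
    majorityLabel = mode(y(trainIdx));
    pred = repmat(majorityLabel, sum(testIdx), 1);

    foldAccuracy(k) = 100 * mean(pred == y(testIdx));
end

% Overall accuracy: average of the per-fold accuracies
overallAccuracy = mean(foldAccuracy);
```

Averaging per-fold accuracies equals the pooled accuracy only when all folds have the same size, which is one reason to split the observations equally.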
Using multiclass ovr-svm with kernel: So far I haven't shown the usage of ovr-svm with a specific kernel ('-t x'). In fact, you can add the kernel option to any OVR code and it will work. The complete code can be found here.
For parameter selection using cross validation, we use the code below to calculate the average accuracy cv. You can just add '-t x' to the code.
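% Note: libsvm reads the input as a precomputed kernel matrix only with '-t 4';
% with any other '-t' value the matrix is treated as ordinary feature vectors.
cmd = ['-q -c ', num2str(2^log2c), ' -g ', num2str(2^log2g),' -t 0'];
cv = get_cv_ac(trainLabel, [(1:NTrain)' trainData*trainData'], cmd, Ncv);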
Training: just add '-t x' to the training code.
bestParam = ['-q -c ', num2str(bestc), ' -g ', num2str(bestg),' -t 0'];
model = ovrtrainBot(trainLabel, [(1:NTrain)' trainData*trainData'], bestParam);
Classification: '-t x' is included in the variable model already, so you don't need to specify '-t x' again when classifying.
[predict_label, accuracy, decis_values] = ovrpredictBot(testLabel, [(1:NTest)' testData*trainData'], model);
[decis_value_winner, label_out] = max(decis_values,[],2);
However, I found that the code can be very slow in the parameter selection routine when the number of classes and the number of cross-validation folds are big (e.g., Nclass = 10, Ncv = 3). I think the slow part might be the matrix [(1:NTrain)' trainData*trainData'], which can be huge. Personally I like to use the default kernel (RBF), for which we don't need to build the kernel matrix X*X' ourselves; this might contribute to the much quicker speed.
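To get a feel for how large that matrix becomes, here is a back-of-the-envelope estimate. The training-set size is an illustrative assumption, and the estimate covers a dense double-precision matrix only.

```matlab
NTrain         = 10000;   % illustrative training-set size
bytesPerDouble = 8;

% The precomputed input has NTrain rows and NTrain + 1 columns
% (one serial-number column plus the NTrain x NTrain kernel matrix)
kernelBytes = bytesPerDouble * NTrain * (NTrain + 1);

fprintf('Precomputed kernel for N = %d: about %.0f MB\n', NTrain, kernelBytes / 2^20);
```

For this size the matrix takes roughly 763 MB, and the cost grows quadratically with the number of training samples.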
Complete example for classification using n-fold cross validation: This code works on a single data set in which the train and test sets are combined. More details can be found here.
Complete example for classification using separate train and test data sets: This code works on a data set where the train and test sets are separated; that is, it trains the model using the train set and uses the model to classify the test set. More details can be found here.
How to obtain the SVM weight vector w: Please see the example code and discussion from StackOverflow.
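For the linear kernel, the weight vector can be recovered from the fields of a trained model: w is the sum of the support vectors weighted by their coefficients, and the bias is -rho. The sketch below uses a toy struct with the same field names as a libsvm model (SVs, sv_coef, rho); the numbers are made up, and with a real model you would use the output of svmtrain with '-t 0' instead. Note that libsvm orients the sign according to the first label it saw in training (model.Label(1)), so w and b may need to be negated.

```matlab
% Toy struct mimicking the relevant fields of a libsvm model (made-up values)
model.SVs     = sparse([1 0; 0 1; -1 0]);   % support vectors, one per row
model.sv_coef = [0.5; 0.5; -1];             % alpha_i * y_i for each support vector
model.rho     = 0.2;                        % negative of the bias term

% Linear kernel only: w = sum_i sv_coef(i) * SVs(i,:)
w = full(model.SVs' * model.sv_coef);       % D x 1 weight vector
b = -model.rho;

% Decision value of a new point x (row vector): x * w + b
x = [1 1];
decisionValue = x * w + b;
```

This only applies to the linear kernel; for other kernels the weight vector lives in the implicit feature space and cannot be written out this way.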
List of available matlab codes
code | binary/multiclass | parameter selection | classification separated/n-fold | kernel | data set | description |
demo_libsvm_test1.m | binary | no, manually | separated | default (RBF) | heart_scale | This code shows the simple (perhaps simplest) usage of the libsvm package to train and classify. Very easy to understand. The code simply runs the SVM on the example data set "heart_scale", which is scaled properly. It divides the data into 2 parts, train: 1 to 200 and test: 201 to 270, then plots the results vs their true class. In order to visualize the high-dimensional data, we apply MDS to the 13D data and reduce the dimension to 2D. |
demo_libsvm_test2.m | binary | no, manually | separated | Specified | heart_scale | Identical to _test1 except that it shows how to specify the kernel (e.g., '-t 4') in the code. |
demo_libsvm_test3.m | binary | semi-automatic, but the code is still not compact | separated | default | heart_scale | Identical to _test1 except that it includes a routine searching for good parameters c and gamma. |
demo_libsvm_test4.m | multiclass, OVR | semi-automatic | separated | default | dna_scale | This code shows how to use the libsvm for the multiclass, more specifically one-vs-rest (OVR), scenario. For both training and classifying, we adopt the OVR wrapper codes posted in the libsvm website: ovrtrain.m and ovrpredict.m respectively. |
demo_libsvm_test5.m | multiclass, OVR | multi-scale automatic but not perfect | separated | default | 10-class spiral | Here both the train and test sets are generated from the 10-class spiral made available here. The data set is very intuitive. In this code, we also make a routine to determine the optimal parameters automatically: the user guesses an initial parameter and the routine keeps improving it. Here we also modify the original train and classify functions a bit: ovrtrainBot.m <-- ovrtrain.m, ovrpredictBot.m <-- ovrpredict.m. Furthermore, the confusion matrix is shown in the results. We also plot the decision values in the feature space to give an idea of what the decision boundary looks like. |
demo_libsvm_test6.m | multiclass, OVR | no, manually | leave-one-out n-fold cross validation | default | 10-class spiral | In this code we illustrate how to perform classification using n-fold cross validation, which is a common methodology when the data set does not come with explicit training and test sets. Such data sets usually come as a single set, and we need to separate them into n equal parts/folds. The leave-one-out n-fold cross validation classifies the observations in fold k using the model trained on all the other folds ({all}-{k}), and repeats the process for all k. The user is required to separate the data into n folds by assigning a "run" label to each observation. The observations with identical run number are grouped together into a fold. It is preferable to have observations from all the classes within each fold. In fact, assigning the run number to each observation randomly is fine as well. |
demo_libsvm_test7.m | multiclass, OVR | multi-scale automatic, quite perfect | separated | default and specific are fine here | 10-class spiral | This code is developed based on _test5. What we add are: a better automatic cross validation routine than _test5.m, and a kernel-specific code snippet. We found that specifying the kernel is much slower than using the default (without '-t x'). At this point, I prefer using the default kernel. |
demo_libsvm_test8.m | multiclass, OVR | multi-scale automatic, quite perfect | separated | default and specific are fine here | 10-class spiral | The code is developed based on _test7. The improvement is that the automatic cross validation for parameter selection is made into a function, which is much more convenient. The function is automaticParameterSelection |
demo_libsvm_test9.m | multiclass, OVR | multi-scale automatic, quite perfect | leave-one-out n-fold cross validation | default | 10-class spiral | This code is an excellent, complete example for classification using n-fold cross validation and automatic parameter selection. The code is developed based on _test8. The difference is that we put the n-fold classification (from _test6) into a function: classifyUsingCrossValidation |
demo_libsvm_test10.m | multiclass, OVR | multi-scale automatic, quite perfect | separated | default | 10-class spiral | This code is an excellent, complete example for classification on a train-test separated data set with automatic parameter selection. The code is developed based on _test8 and _test9. |
demo_libsvm_test11.m | multiclass, OVR | multi-scale automatic, quite perfect | separated | default and specific are fine here | 3-class ring | This code is developed based on _test10, except that the code is made to work for any kernel. However, the results are not good at all. Moreover, the run time is not good either. We found a better way using multiclass pair-wise SVM, which is the default multiclass SVM approach in the libsvm package. In the next version (_test12), we will test the pair-wise SVM. |
demo_libsvm_test12.m | multiclass, pair-wise (default method for multiclass in the libsvm package) | multi-scale automatic, quite perfect | separated | default and specific kernel are fine here. | 4-class spiral | The code is developed based on _test11. I figured out that the functions svmtrain and svmpredict, originally implemented in libsvm, support multiclass pair-wise SVM. We don't even need to make the kernel matrix ourselves; all you need to do is pick your kernel '-t x' and parameters '-c y -g z', and you will get the results. With this in mind, I made another version of the parameter selection routine using cross validation: automaticParameterSelection2.m (only slightly different from automaticParameterSelection.m), which calls the n-fold cross validation classification routine svmNFoldCrossValidation.m. I would say this is the best code so far to run on a separated data set, as it provides the parameter selection routine and the train and classification routines. Very easy to follow. |
demo_libsvm_test13.m | multiclass, pair-wise | multi-scale automatic, quite perfect | leave-one-out n-fold cross validation | default and specific kernel are fine here. | 4-class ring | The code is developed based on _test12. The only difference is that this code uses n-fold cross validation when classifying the "single" data set, i.e., a data set where the train and test sets are combined, which is often the case when the number of observations is limited. This is the best code to use on a single data set with n-fold cross validation classification. |